Autonomous driving paper index
Risk-Sensitive Distributional Proximal Policy Optimization for Safe Highway Lane-Change Decision-Making
One-line summary
This paper proposes Risk-Sensitive Distributional Proximal Policy Optimization (RSDPPO), a risk-sensitive, distributional variant of PPO for highway lane-changing decision-making. It targets the low-probability but high-cost safety risks that expected-return maximization may inadequately account for.
Engineering notes
In simulation on a highway lane-changing scenario with heterogeneous surrounding vehicles, RSDPPO outperforms representative baseline algorithms under medium-density traffic in cumulative reward, success rate, and safety reward.
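Chinese explanation
Chinese explanation pending: this site prioritizes adding Chinese-language notes for high-value papers on end-to-end autonomous driving, BEV perception, 3D object detection, trajectory prediction, path planning, and LiDAR perception.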
Original abstract
Decision-making is a critical module for intelligent vehicles to achieve safe and efficient autonomous driving. However, most existing reinforcement learning-based decision-making methods optimize policies by maximizing the expected return, which may inadequately account for low-probability but high-cost safety risks in complex traffic interactions. To address this issue, this paper proposes a Risk-Sensitive Distributional Proximal Policy Optimization (PPO) method, termed Risk-Sensitive Distributional Proximal Policy Optimization (RSDPPO), for highway lane-changing decision-making. Within the PPO framework, a distributional state-value function is introduced to model the return distribution under the current policy, and a Wang distortion-based risk measure is further incorporated to construct a risk-sensitive advantage function. In this way, risk information contained in the return distribution can be propagated into the policy gradient update, guiding the learned policy to avoid high-risk driving behaviors while maintaining training stability. Simulation experiments are conducted in a highway lane-changing scenario with heterogeneous surrounding vehicles. The results show that, under medium-density traffic, the proposed method outperforms representative baseline algorithms in cumulative reward, success rate, and safety reward. Further evaluation under higher-density traffic demonstrates that RSDPPO maintains better overall performance, indicating stronger adaptability to denser traffic conditions. Ablation studies further show that risk-averse distortion improves the balance between safety and efficiency by increasing safety margins during car-following and lane-changing maneuvers. These results indicate that RSDPPO provides an effective risk-sensitive policy optimization framework for safety-oriented highway lane-changing decision-making.
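The abstract describes applying a Wang distortion-based risk measure to a distributional state-value function and using the result to build a risk-sensitive advantage. The following is a minimal sketch of that idea, not the paper's implementation. It assumes the return distribution is represented by equally weighted quantile estimates and that the advantage is a one-step TD residual on the distorted values; neither detail is stated in the abstract. The function names, the `beta` parameter, and its sign convention (here `beta > 0` is risk-averse) are hypothetical.

```python
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal: mean 0, standard deviation 1


def wang_distortion(u, beta):
    """Wang transform g(u) = Phi(Phi^{-1}(u) + beta) on [0, 1].

    Assumed convention for this sketch: beta > 0 moves probability mass
    toward low returns (risk-averse); beta = 0 is the identity.
    """
    if u <= 0.0:
        return 0.0  # inv_cdf is undefined at 0, so fix the endpoint
    if u >= 1.0:
        return 1.0  # inv_cdf is undefined at 1, so fix the endpoint
    return _STD_NORMAL.cdf(_STD_NORMAL.inv_cdf(u) + beta)


def distorted_value(quantiles, beta):
    """Distorted expectation of a return distribution.

    `quantiles` holds N equally weighted quantile estimates of the return.
    The i-th smallest quantile gets weight g((i+1)/N) - g(i/N).
    """
    ordered = sorted(quantiles)
    n = len(ordered)
    total = 0.0
    for i, q in enumerate(ordered):
        weight = wang_distortion((i + 1) / n, beta) - wang_distortion(i / n, beta)
        total += weight * q
    return total


def risk_sensitive_td_advantage(reward, gamma, quantiles_s, quantiles_next,
                                beta, done=False):
    """One-step TD residual on distorted values (an assumed construction)."""
    v_s = distorted_value(quantiles_s, beta)
    v_next = 0.0 if done else distorted_value(quantiles_next, beta)
    return reward + gamma * v_next - v_s


if __name__ == "__main__":
    # Hypothetical return quantiles with a rare, costly outcome (-10).
    returns = [-10.0, 0.0, 1.0, 2.0]
    print("risk-neutral value:", distorted_value(returns, 0.0))
    print("risk-averse value: ", distorted_value(returns, 0.5))
```

With `beta = 0` the weights are uniform and the value reduces to the ordinary mean, so the advantage reduces to a standard TD residual. With `beta > 0` the lowest quantiles receive more weight, so states whose return distribution has a heavy lower tail are valued lower, which is one way the risk information could reach the policy gradient.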