Autonomous driving paper index

Stability-Aware Reinforcement Learning for Autonomous Driving With Dynamics-Augmented State and Lyapunov Constraints

2025-11-01 · IEEE Robotics and Automation Letters

End-to-End Autonomous Driving · Autonomous Driving Simulation

autonomous driving · end-to-end · reinforcement learning · carla · prediction

One-line summary

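Traditional reinforcement learning methods for driving decisions often lack vehicle dynamics modeling and formal stability constraints. This paper presents an end-to-end reinforcement learning framework that adds a data-driven vehicle dynamics prediction model and Lyapunov-based stability constraints.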

Engineering notes

Key topics: autonomous driving, end-to-end, reinforcement learning, CARLA, prediction. See the paper for implementation details and experimental results.

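Chinese explanation

Chinese explanation pending: this site prioritizes adding Chinese-language notes for high-value papers on end-to-end autonomous driving, BEV perception, 3D object detection, trajectory prediction, path planning, and LiDAR perception.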

Original abstract

Autonomous driving in extreme conditions presents substantial challenges in ensuring vehicle stability and safety. Traditional reinforcement learning (RL) methods for decision-making often lack vehicle dynamics modeling and formal stability constraints, leading to dynamically infeasible behaviors and unstable policy training. To overcome these limitations, we present an end-to-end reinforcement learning framework that incorporates a data-driven vehicle dynamics prediction model and Lyapunov-based stability constraints. The dynamics module is constructed using a hybrid Transformer architecture to effectively capture nonlinearities, time-varying parameters, and the coupling between longitudinal and lateral motions. This module captures the nonlinear interaction among vehicle, tire, and road, and provides predicted dynamic states as auxiliary inputs to enhance the RL state representation. Second, a neural network-based Lyapunov candidate function is incorporated into an enhanced Soft Actor–Critic (SAC) framework to impose stability-aware constraints on policy learning. To explicitly characterize lateral instability, the squared sideslip angle at the vehicle’s center of gravity is defined as the Lyapunov cost. In addition, a hierarchical reward function is designed to balance multiple objectives during policy learning. The proposed framework is then validated through open-loop prediction experiments using both simulated and real vehicle data, followed by closed-loop evaluation in the CARLA simulator under representative driving scenarios, including low-friction road and dynamic obstacle avoidance. Experimental results show that the proposed method leads to significant improvements in both the stability and safety of the learned policies.
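As a rough illustration of the Lyapunov cost described in the abstract, the sketch below computes the sideslip angle at the center of gravity from body-frame velocities and squares it. The function names, the hinge-style `decrease_violation` term, and the `alpha` parameter are assumptions made for illustration; the abstract does not state the exact decrease condition or how it enters the SAC objective, so the form shown is one commonly used in Lyapunov-constrained actor-critic methods, not necessarily the paper's.

```python
import math


def sideslip_angle(v_x: float, v_y: float) -> float:
    """Sideslip angle (rad): angle between the velocity vector at the
    center of gravity and the vehicle's longitudinal axis."""
    return math.atan2(v_y, v_x)


def lyapunov_cost(v_x: float, v_y: float) -> float:
    """Squared sideslip angle, used as the Lyapunov cost in the abstract."""
    beta = sideslip_angle(v_x, v_y)
    return beta ** 2


def decrease_violation(l_now: float, l_next: float, cost_now: float,
                       alpha: float = 0.1) -> float:
    """Hinge on an ASSUMED sampled decrease condition
        L(s') - L(s) <= -alpha * c(s).
    Returns 0 when the condition holds, a positive value otherwise.
    l_now and l_next stand in for outputs of a learned Lyapunov network."""
    return max(0.0, l_next - l_now + alpha * cost_now)


if __name__ == "__main__":
    # Straight-line driving: no lateral velocity, so zero cost.
    print(lyapunov_cost(20.0, 0.0))
    # Lateral slip of 2 m/s at 20 m/s forward speed.
    print(lyapunov_cost(20.0, 2.0))
    # Lyapunov value increased between steps, so the condition is violated.
    print(decrease_violation(l_now=1.0, l_next=1.2, cost_now=1.0))
```

In a constrained policy update, a term like `decrease_violation` would typically be weighted by a Lagrange multiplier and added to the actor loss; that wiring is left out here because the abstract does not describe it.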

5.5 Engineering value

7.0 Research novelty

5.0 Business relevance

Links and sources

Official / arXiv page

