Autonomous driving paper index
Simulation Study on Real-Time Autonomous Driving Decision-Making Using BEV Perception and Large Language Models
One-line summary
Large language models (LLMs) exhibit strong semantic reasoning capabilities for autonomous driving decision-making; however, their substantial inference latency poses a critical challenge for real-time closed-loop vehicle control.
Engineering notes
In CARLA closed-loop experiments, the reported system satisfies the 10 Hz real-time control requirement and improves control quality, with lower collision rates and lower Average Jerk than both traditional imitation learning (Behavioral Cloning, BC) and the Transformer-based TransFuser baseline.
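The two quantities named above can be made concrete with a small sketch. The paper's exact definition of Average Jerk is not given here, so this assumes a common one: the mean absolute finite difference of uniformly sampled acceleration. The function names `average_jerk` and `meets_control_rate` are hypothetical, and the deadline check simply treats 10 Hz as a 100 ms budget per control step.

```python
def average_jerk(accelerations, dt):
    """Mean absolute jerk (m/s^3) from uniformly sampled acceleration (m/s^2).

    Assumed definition: average of |a[k+1] - a[k]| / dt over the trajectory.
    """
    if dt <= 0:
        raise ValueError("dt must be positive")
    if len(accelerations) < 2:
        return 0.0
    jerks = [abs(b - a) / dt for a, b in zip(accelerations, accelerations[1:])]
    return sum(jerks) / len(jerks)


def meets_control_rate(latencies_s, rate_hz=10.0):
    """True if every end-to-end latency fits within one control period."""
    period = 1.0 / rate_hz  # 0.1 s at 10 Hz
    return all(t <= period for t in latencies_s)


if __name__ == "__main__":
    # Illustrative numbers only, not results from the paper.
    print(average_jerk([0.0, 1.0, 3.0], dt=0.5))
    print(meets_control_rate([0.05, 0.09]))
```

Checking the worst-case latency rather than the mean is a deliberate choice in this sketch: a closed-loop controller misses its deadline whenever a single step exceeds the period.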
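Chinese explanation
Chinese explanation pending: this site prioritizes adding Chinese-language explanations for high-value papers on end-to-end autonomous driving, BEV perception, 3D object detection, trajectory prediction, path planning, and LiDAR perception.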
Original abstract
Large language models (LLMs) exhibit strong semantic reasoning capabilities for autonomous driving decision-making; however, their substantial inference latency poses a critical challenge for real-time closed-loop vehicle control. This study proposes an engineering-oriented framework to enable latency-constrained LLM-based decision-making by integrating bird’s-eye-view (BEV) structured perception with low-bit quantized inference. The BEV perception module compresses multi-view visual inputs into structured semantic representations, thereby reducing input redundancy and enhancing inference efficiency. In addition, 4-bit post-training quantization (PTQ), combined with an optimized inference engine, is employed to alleviate computational and memory bandwidth constraints during autoregressive decoding. Experiments conducted on the CARLA simulation platform under car-following, overtaking, and mixed driving scenarios—validated through 500 independent trials—demonstrate that the proposed framework substantially reduces end-to-end inference latency while maintaining stable decision-making performance. The results indicate that the system satisfies the 10 Hz real-time control requirement and significantly improves control quality, as evidenced by reduced collision rates and lower Average Jerk compared with both traditional imitation learning (Behavioral Cloning, BC) and the Transformer-based TransFuser baseline. Furthermore, sensitivity analyses confirm the robustness of the framework under environmental degradation and perception noise, underscoring the practical feasibility of deploying LLMs for safe and reliable closed-loop autonomous driving.
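To illustrate what 4-bit post-training quantization does to a group of weights, here is a minimal sketch. The abstract does not say which PTQ algorithm or inference engine the authors use, so this assumes the simplest variant: symmetric round-to-nearest quantization with one scale per weight group. The function names `quantize_4bit` and `dequantize_4bit` are hypothetical.

```python
def quantize_4bit(weights):
    """Symmetric round-to-nearest quantization of one weight group.

    Returns integer codes in the signed 4-bit range [-8, 7] and the
    floating-point scale needed to reconstruct approximate weights.
    """
    max_abs = max((abs(w) for w in weights), default=0.0)
    scale = max_abs / 7.0 if max_abs > 0 else 1.0
    codes = [max(-8, min(7, round(w / scale))) for w in weights]
    return codes, scale


def dequantize_4bit(codes, scale):
    """Reconstruct approximate weights from 4-bit codes and their scale."""
    return [c * scale for c in codes]


if __name__ == "__main__":
    # Illustrative weights only.
    codes, scale = quantize_4bit([3.5, -1.5, 0.0, 0.5])
    print(codes, scale)
    print(dequantize_4bit(codes, scale))
```

Storing each weight as a 4-bit code instead of a 16-bit float cuts weight memory traffic by roughly a factor of four. That is the memory-bandwidth relief the abstract refers to for autoregressive decoding. Production methods typically add calibration data and error compensation on top of this basic scheme.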