Autonomous driving paper index

SimLingo: Vision-Only Closed-Loop Autonomous Driving with Language-Action Alignment

2025-03-12 · Computer Vision and Pattern Recognition · arXiv: 2503.09594

autonomous drivinglidarcarlavision language modellarge language model

One-line summary

Therefore, we propose a model that can handle three different tasks: (1) closed-loop driving, (2) vision-language understanding, and (3) language-action alignment.

Engineering notes

SimLingo obtains state-of-the-art performance on the widely used CARLA simulator on the Bench2Drive benchmark and is the winning entry at the CARLA challenge 2024.

Chinese explanation / 中文解读

中文解读待补充:本站会优先为端到端自动驾驶、BEV感知、3D目标检测、轨迹预测、路径规划、LiDAR感知等高价值论文补充中文说明。

Original abstract

Integrating large language models (LLMs) into autonomous driving has attracted significant attention with the hope of improving generalization and explainability. However, existing methods often focus on either driving or vision-language understanding but achieving both high driving performance and extensive language understanding remains challenging. In addition, the dominant approach to tackle vision-language understanding is using visual question answering. However, for autonomous driving, this is only useful if it is aligned with the action space. Otherwise, the model’s answers could be inconsistent with its behavior. Therefore, we propose a model that can handle three different tasks: (1) closed-loop driving, (2) vision-language understanding, and (3) language-action alignment. Our model SimLingo is based on a vision language model (VLM) and works using only camera, excluding expensive sensors like LiDAR. SimLingo obtains state-of-the-art performance on the widely used CARLA simulator on the Bench2Drive benchmark and is the winning entry at the CARLA challenge 2024. Additionally, we achieve strong results in a wide variety of language-related tasks while maintaining high driving performance. Project page: https://katrinrenz.de/simlingo

5.0Engineering value
8.5Research novelty
5.0Business relevance

Links and sources

Need this topic turned into a technical roadmap?

Full Self Driving can prepare a custom autonomous driving literature review, code map, dataset map, and B2B technology assessment.

Request B2B research

Comments

No comments yet. Be the first to share your thoughts on this paper.
Login or register to leave a comment