Are VLMs Ready for Autonomous Driving? An Empirical Study from the Reliability, Data, and Metric Perspectives

2025-01-07 · IEEE International Conference on Computer Vision · arXiv: 2501.04003

autonomous driving system · autonomous driving · perception · planning

One-line summary

DriveBench is a benchmark that evaluates 12 VLMs across 17 settings, covering 19,200 images, 20,498 QA pairs, and four key driving tasks, to test whether VLMs give visually grounded and reliable driving explanations.

Engineering notes

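The study finds that existing VLMs often produce plausible driving answers from general knowledge or textual cues rather than from the image itself, especially when visual inputs are degraded or missing. Dataset imbalances and insufficient evaluation metrics conceal this behavior. The VLMs are aware of input corruption but only acknowledge it when directly prompted. The authors propose Robust Agentic Utilization (RAU), which combines this corruption awareness with agentic planning over external tools to improve perception reliability for downstream tasks.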
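One way to see why imbalanced answers and a plain accuracy metric can hide missing visual grounding is to compare a model's score with and without the image against a majority-answer baseline. The sketch below is only an illustration of that idea, not the DriveBench evaluation code: the function names, the `visual_gain` quantity, and the toy answers are all assumptions made up for this example.

```python
from collections import Counter


def accuracy(predictions, labels):
    """Fraction of predictions that exactly match the reference labels."""
    if len(predictions) != len(labels):
        raise ValueError("predictions and labels must have the same length")
    if not labels:
        raise ValueError("labels must be non-empty")
    return sum(p == y for p, y in zip(predictions, labels)) / len(labels)


def majority_baseline_accuracy(labels):
    """Accuracy obtained by always answering with the most frequent label."""
    if not labels:
        raise ValueError("labels must be non-empty")
    most_common_count = Counter(labels).most_common(1)[0][1]
    return most_common_count / len(labels)


def grounding_report(labels, clean_preds, no_image_preds):
    """Compare accuracy with images, without images, and for a blind baseline."""
    clean = accuracy(clean_preds, labels)
    blind = accuracy(no_image_preds, labels)
    return {
        "clean_accuracy": clean,
        "no_image_accuracy": blind,
        "majority_baseline": majority_baseline_accuracy(labels),
        # Hypothetical diagnostic: how much the image actually contributes.
        "visual_gain": clean - blind,
    }


# Toy, made-up answers for illustration only (not DriveBench data or results).
labels = ["going ahead"] * 8 + ["turn left", "turn right"]
clean_preds = ["going ahead"] * 8 + ["turn left", "going ahead"]
no_image_preds = ["going ahead"] * 10  # model never saw the image

report = grounding_report(labels, clean_preds, no_image_preds)
for name, value in report.items():
    print(f"{name}: {value:.2f}")
```

In this toy setup the answer distribution is heavily skewed, so a model that never sees the image scores the same as the majority baseline and only slightly below the model with images. A small `visual_gain` next to a high headline accuracy is the kind of signal that would suggest answers are coming from priors rather than from the visual input.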

Original abstract

Recent advancements in Vision-Language Models (VLMs) have fueled interest in autonomous driving applications, particularly for interpretable decision-making. However, the assumption that VLMs provide visually grounded and reliable driving explanations remains unexamined. To address this, we introduce DriveBench, a benchmark evaluating 12 VLMs across 17 settings, covering 19,200 images, 20,498 QA pairs, and four key driving tasks. Our findings reveal that existing VLMs often generate plausible responses from general knowledge or textual cues rather than true visual grounding, especially under degraded or missing visual inputs. This behavior, concealed by dataset imbalances and insufficient evaluation metrics, poses significant risks in safety-critical scenarios like autonomous driving. We further observe that VLMs possess inherent corruption-awareness but only explicitly acknowledge these issues when directly prompted. Given the challenges and inspired by the inherent corruption awareness, we propose Robust Agentic Utilization (RAU), leveraging VLMs' corruption awareness and agentic planning with external tools to enhance perception reliability for a diverse set of downstream tasks. Our study challenges existing evaluation paradigms and provides a road map toward more robust and interpretable autonomous driving systems.

Engineering value: 5.0

Research novelty: 7.0

Business relevance: 5.0
