Autonomous driving paper index

Driving with LLMs: Fusing Object-Level Vector Modality for Explainable Autonomous Driving

2023-10-03 · IEEE International Conference on Robotics and Automation · arXiv: 2310.01957

autonomous drivinglarge language modelcontrol

One-line summary

We introduce a unique objectlevel multimodal LLM architecture that merges vectorized numeric modalities with a pre-trained LLM to improve context understanding in driving situations.

Engineering notes

We make our benchmark, datasets, and model available1 for further exploration.

Chinese explanation / 中文解读

中文解读待补充：本站会优先为端到端自动驾驶、BEV感知、3D目标检测、轨迹预测、路径规划、LiDAR感知等高价值论文补充中文说明。

Original abstract

Large Language Models (LLMs) have shown promise in the autonomous driving sector, particularly in generalization and interpretability. We introduce a unique objectlevel multimodal LLM architecture that merges vectorized numeric modalities with a pre-trained LLM to improve context understanding in driving situations. We also present a new dataset of 160k QA pairs derived from 10k driving scenarios, paired with high quality control commands collected with RL agent and question answer pairs generated by teacher LLM (GPT-3.5). A distinct pretraining strategy is devised to align numeric vector modalities with static LLM representations using vector captioning language data. We also introduce an evaluation metric for Driving QA and demonstrate our LLM-driver’s proficiency in interpreting driving scenarios, answering questions, and decision-making. Our findings highlight the potential of LLM-based driving action generation in comparison to traditional behavioral cloning. We make our benchmark, datasets, and model available1 for further exploration.

5.0Engineering value

7.5Research novelty

5.0Business relevance

Links and sources

Need this topic turned into a technical roadmap?

Full Self Driving can prepare a custom autonomous driving literature review, code map, dataset map, and B2B technology assessment.

Request B2B research

Comments

No comments yet. Be the first to share your thoughts on this paper.