Autonomous driving paper index

VLAAD: Vision and Language Assistant for Autonomous Driving

2024-01-01 · 2024 IEEE/CVF Winter Conference on Applications of Computer Vision Workshops (WACVW)

autonomous driving

One-line summary

The paper introduces a multi-modal instruction tuning dataset that helps language models learn visual instructions across diverse driving scenarios, and uses it to fine-tune a multi-modal LLM driving assistant named VLAAD.

Engineering notes

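Key topics: autonomous driving, multi-modal instruction tuning, multi-modal LLMs, interpretable decision-making. The dataset covers three tasks: conversation, detailed description, and complex reasoning. See the paper and its GitHub repository for implementation details and experimental results.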
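
The three task types named in the abstract suggest a simple record layout for instruction tuning data. The sketch below is only an illustration, not the authors' schema: the `Task` enum, the `InstructionSample` fields, the `video_id` identifier, and the prompt template in `to_prompt` are all hypothetical names chosen for this example.

```python
from dataclasses import dataclass
from enum import Enum


class Task(Enum):
    # The three task types listed in the abstract.
    CONVERSATION = "conversation"
    DETAILED_DESCRIPTION = "detailed_description"
    COMPLEX_REASONING = "complex_reasoning"


@dataclass
class InstructionSample:
    video_id: str     # hypothetical identifier for a driving clip
    task: Task        # which of the three tasks this sample belongs to
    instruction: str  # the question or request shown to the model
    response: str     # the target answer used for fine-tuning


def to_prompt(sample: InstructionSample) -> str:
    """Render one sample as plain text (hypothetical template, not the paper's)."""
    return (
        f"<video:{sample.video_id}>\n"
        f"### Instruction ({sample.task.value}): {sample.instruction}\n"
        f"### Response: {sample.response}"
    )


def group_by_task(samples):
    """Bucket samples by task so each task type can be inspected separately."""
    groups = {task: [] for task in Task}
    for sample in samples:
        groups[sample.task].append(sample)
    return groups


# Placeholder samples; the text is illustrative, not taken from the dataset.
samples = [
    InstructionSample("clip_0001", Task.CONVERSATION,
                      "What is the car ahead doing?", "placeholder answer"),
    InstructionSample("clip_0001", Task.COMPLEX_REASONING,
                      "Why should the ego vehicle slow down?", "placeholder answer"),
]
groups = group_by_task(samples)
```

Keeping the task label on every record makes it easy to check how the data is balanced across tasks before fine-tuning. The actual file format used by VLAAD should be taken from the repository linked in the abstract.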

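Chinese explanation

Chinese explanation pending. This site prioritizes adding Chinese explanations for high-value papers on end-to-end autonomous driving, BEV perception, 3D object detection, trajectory prediction, path planning, and LiDAR perception.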

Original abstract

While interpretable decision-making is pivotal in autonomous driving, research integrating natural language models remains relatively untapped. To address this, we introduce a multi-modal instruction tuning dataset that facilitates language models in learning visual instructions across diverse driving scenarios. This dataset encompasses three primary tasks: conversation, detailed description, and complex reasoning. Capitalizing on this dataset, we present a multi-modal LLM driving assistant named VLAAD. After being fine-tuned on our instruction-following dataset, VLAAD demonstrates proficient interpretive capabilities across a spectrum of driving situations. We open our work, dataset, and model to the public on GitHub: https://github.com/sungyeonparkk/vision-assistant-for-driving

Engineering value: 6.5

Research novelty: 7.0

Business relevance: 5.0

Links and sources

Official / arXiv page

