Autonomous driving paper index

OpenEMMA: Open-Source Multimodal Model for End-to-End Autonomous Driving

2024-12-19 · 2025 IEEE/CVF Winter Conference on Applications of Computer Vision Workshops (WACVW) · arXiv: 2412.15208

end-to-end autonomous driving · autonomous driving · end-to-end · large language model

One-line summary

Drawing inspiration from recent advancements in inference computing, we propose OpenEMMA, an open-source end-to-end framework based on MLLMs.

Engineering notes

Drawing inspiration from recent advancements in inference computing, we propose OpenEMMA, an open-source end-to-end framework based on MLLMs. By incorporating the Chain-of-Thought reasoning process, OpenEMMA achieves significant improvements compared to the baseline when leveraging a diverse range of MLLMs.
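
To make the Chain-of-Thought idea in the note above concrete, here is a minimal sketch of how a staged reasoning prompt could be assembled and a planner answer parsed. It is not taken from the OpenEMMA repository: the reasoning stages, the `EgoState` fields, the `WAYPOINTS:` output convention, and the `fake_model` stub are all assumptions made for illustration, and a real MLLM call would also pass camera frames, which are omitted here.

```python
import re
from dataclasses import dataclass
from typing import Callable, List, Tuple

# Hypothetical reasoning stages; the abstract only says "Chain-of-Thought".
COT_STAGES = [
    "Describe the driving scene (road layout, traffic signals, weather).",
    "List the critical objects and explain why each one matters.",
    "State the ego vehicle's high-level intent for the next few seconds.",
]


@dataclass
class EgoState:
    """Assumed ego-vehicle summary passed to the model as text."""
    speed_mps: float
    heading_deg: float


def build_cot_prompt(ego: EgoState, horizon: int) -> str:
    """Ask for step-by-step reasoning first, then a machine-readable answer."""
    steps = "\n".join(f"Step {i}: {s}" for i, s in enumerate(COT_STAGES, start=1))
    return (
        f"Ego speed: {ego.speed_mps:.1f} m/s, heading: {ego.heading_deg:.1f} deg.\n"
        "Reason step by step before answering.\n"
        f"{steps}\n"
        f"Finally, output exactly {horizon} future waypoints in the ego frame "
        "on one line as 'WAYPOINTS: (x, y) (x, y) ...' in meters."
    )


def parse_waypoints(text: str) -> List[Tuple[float, float]]:
    """Extract every '(x, y)' pair of decimal numbers from the text."""
    pattern = r"\(\s*(-?\d+(?:\.\d+)?)\s*,\s*(-?\d+(?:\.\d+)?)\s*\)"
    return [(float(x), float(y)) for x, y in re.findall(pattern, text)]


def plan(ego: EgoState, horizon: int,
         model: Callable[[str], str]) -> List[Tuple[float, float]]:
    """Run one prompt/response round trip and validate the parsed answer."""
    reply = model(build_cot_prompt(ego, horizon))
    # Parse only the text after the marker so parentheses in the reasoning
    # steps are not mistaken for waypoints.
    answer = reply.split("WAYPOINTS:", 1)[-1]
    waypoints = parse_waypoints(answer)
    if len(waypoints) != horizon:
        raise ValueError(f"expected {horizon} waypoints, got {len(waypoints)}")
    return waypoints


def fake_model(prompt: str) -> str:
    """Stand-in for an MLLM call; returns a canned reply for demonstration."""
    return (
        "Scene: two-lane road, clear weather (no signals).\n"
        "Objects: lead vehicle ahead.\n"
        "Intent: keep lane.\n"
        "WAYPOINTS: (1.0, 0.0) (2.0, 0.1) (3.0, 0.2)"
    )


if __name__ == "__main__":
    print(plan(EgoState(speed_mps=5.0, heading_deg=0.0), 3, fake_model))
```

Swapping `fake_model` for a function that calls an actual multimodal model is the only change needed to try the sketch against a real backend. The strict waypoint-count check is a design choice that surfaces malformed replies early instead of silently planning on a partial answer.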

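Chinese explanation

Chinese explanation pending: this site prioritizes adding Chinese-language explanations for high-value papers on end-to-end autonomous driving, BEV perception, 3D object detection, trajectory prediction, path planning, and LiDAR perception.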

Original abstract

Since the advent of Multimodal Large Language Models (MLLMs), they have made a significant impact across a wide range of real-world applications, particularly in Autonomous Driving (AD). Their ability to process complex visual data and reason about intricate driving scenarios has paved the way for a new paradigm in end-to-end AD systems. However, the progress of developing end-to-end models for AD has been slow, as existing fine-tuning methods demand substantial resources, including extensive computational power, large-scale datasets, and significant funding. Drawing inspiration from recent advancements in inference computing, we propose OpenEMMA, an open-source end-to-end framework based on MLLMs. By incorporating the Chain-of-Thought reasoning process, OpenEMMA achieves significant improvements compared to the baseline when leveraging a diverse range of MLLMs. Furthermore, OpenEMMA demonstrates effectiveness, generalizability, and robustness across a variety of challenging driving scenarios, offering a more efficient and effective approach to autonomous driving. We release all the codes in https://github.com/taco-group/OpenEMMA.

Engineering value: 7.5
Research novelty: 7.5
Business relevance: 5.5
