Autonomous driving paper index

RT-BEV: Enhancing Real-Time BEV Perception for Autonomous Vehicles

2024-12-10 · IEEE Real-Time Systems Symposium

End-to-End Autonomous Driving BEV Perception

autonomous driving, autonomous vehicle, bev perception, bev, end-to-end, object detection, nuscenes, perception

One-line summary

RT-BEV is proposed as the first framework designed to co-optimize message communication and object detection, improving real-time e2e BEV perception without sacrificing accuracy.

Engineering notes

RT-BEV is shown to significantly enhance real-time BEV perception: it reduces average e2e latency by $1.5 \times$, maintains high mean Average Precision (mAP), doubles the number of processed frames, and improves the frame efficiency score (FES) by $2.9 \times$ compared with existing approaches.
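To make the "$N \times$" latency figures concrete, the sketch below converts a reduction factor into an absolute latency and a percentage saving. The baseline of 300 ms is a hypothetical value chosen for illustration, not a measurement from the paper, and both function names are made up for this example.

```python
def latency_after_reduction(baseline_ms: float, factor: float) -> float:
    """Latency after an 'N x' reduction, i.e. the baseline divided by the factor."""
    if factor <= 0:
        raise ValueError("factor must be positive")
    return baseline_ms / factor


def percent_saved(factor: float) -> float:
    """Share of the baseline latency removed by an 'N x' reduction, in percent."""
    if factor <= 0:
        raise ValueError("factor must be positive")
    return (1.0 - 1.0 / factor) * 100.0


# Hypothetical baseline (assumption, not reported in the source).
baseline_ms = 300.0
reduced_ms = latency_after_reduction(baseline_ms, 1.5)
saved_pct = percent_saved(1.5)
print(f"{baseline_ms:.0f} ms -> {reduced_ms:.0f} ms ({saved_pct:.1f}% lower)")
```

Read this way, a $1.5 \times$ reduction removes about one third of the baseline latency, whatever the baseline is.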

Original abstract

Vision-centric Bird’s Eye View (BEV) perception has become popular for enhancing the situational awareness of autonomous vehicles (AVs). It uses multiple cameras to create a 360° view, capturing essential details for the vehicle’s navigation and decision-making. However, reducing the end-to-end (e2e) BEV perception latency without sacrificing accuracy is challenging due to the lack of co-optimization of message communication and object detection. Prior work either compresses the dense detection model to reduce computation, which can hurt accuracy and assumes images are well synchronized, or focuses on worst-case communication delay without considering the characteristics of object detection. To meet this challenge, we propose RT-BEV, the first framework designed to co-optimize message communication and object detection to improve real-time e2e BEV perception without sacrificing accuracy. The main insight of RT-BEV lies in generating traffic environment- and context-aware Regions of Interest (ROIs) for AV safety, combined with ROI-aware message communication. RT-BEV features an ROI-aware Camera Synchronizer that adaptively determines message groups and allowable delays based on ROIs’ coverage. We also develop an ROIs Generator to model context-aware ROIs and a Feature Split & Merge component to handle variable-sized ROIs effectively. Furthermore, a Time Predictor forecasts timelines for processing ROIs, and a Coordinator jointly optimizes latency and accuracy for the entire e2e pipeline. We have implemented RT-BEV in a ROS-based BEV perception pipeline and evaluated it with the nuScenes dataset. RT-BEV is shown to significantly enhance real-time BEV perception, reducing average e2e latency by $1.5 \times$, maintaining high mean Average Precision (mAP), doubling the number of processed frames, and improving the frame efficiency score (FES) by $2.9 \times$ compared to the existing approaches. Moreover, RT-BEV is shown to reduce the worst-case e2e latency by $19.3 \times$.
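The abstract says the ROI-aware Camera Synchronizer picks message groups and allowable delays from ROI coverage, but gives no algorithm. The toy sketch below is only an illustration of that general idea, not the paper's method: it releases a frame as soon as the cameras covering the current ROIs have arrived, instead of waiting for all six. The `CameraMsg` type, both functions, the arrival times, and the 20 ms delay budget are assumptions made for this example; only the camera names follow the nuScenes convention.

```python
from dataclasses import dataclass


@dataclass
class CameraMsg:
    camera: str        # camera name (nuScenes-style)
    arrival_ms: float  # arrival time relative to the frame trigger (hypothetical)


def wait_for_all_release(msgs: list[CameraMsg]) -> float:
    """Baseline: release the frame only when the slowest camera has arrived."""
    return max(m.arrival_ms for m in msgs)


def roi_aware_release(msgs: list[CameraMsg], roi_cameras: set[str],
                      allowable_delay_ms: float) -> tuple[list[str], float]:
    """Toy ROI-aware grouping (illustrative assumption, not the paper's algorithm).

    The group is every ROI-covering camera that arrives within the allowable
    delay of the first message. The frame is released when the last group
    member arrives, or at the deadline if an ROI camera is still missing.
    """
    first = min(m.arrival_ms for m in msgs)
    deadline = first + allowable_delay_ms
    selected = [m for m in msgs
                if m.camera in roi_cameras and m.arrival_ms <= deadline]
    missing = roi_cameras - {m.camera for m in selected}
    if missing:
        release = deadline
    else:
        release = max((m.arrival_ms for m in selected), default=first)
    return sorted(m.camera for m in selected), release


# Hypothetical arrival times: the rear camera is badly delayed.
msgs = [
    CameraMsg("CAM_FRONT", 2.0),
    CameraMsg("CAM_FRONT_LEFT", 5.0),
    CameraMsg("CAM_FRONT_RIGHT", 4.0),
    CameraMsg("CAM_BACK", 40.0),
    CameraMsg("CAM_BACK_LEFT", 6.0),
    CameraMsg("CAM_BACK_RIGHT", 7.0),
]
front_rois = {"CAM_FRONT", "CAM_FRONT_LEFT", "CAM_FRONT_RIGHT"}
group, release_ms = roi_aware_release(msgs, front_rois, allowable_delay_ms=20.0)
print(group, release_ms, wait_for_all_release(msgs))
```

In this made-up trace the wait-for-all baseline is held up by the delayed rear camera, while the ROI-aware group can be released as soon as the three front cameras are in. If an ROI did fall in the rear camera's view, the same function would wait only until the deadline rather than indefinitely.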

Engineering value: 6.0

Research novelty: 7.0

Business relevance: 5.0

Links and sources

Official / arXiv page
