Autonomous driving paper index
FreqBEV-V2I: Frequency-Domain BEV-Enhanced Vehicle-to-Infrastructure Cooperative 3D Detection
One-line summary
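FreqBEV-V2I is a vehicle-to-infrastructure cooperative 3D detection framework that represents BEV features in the frequency domain. Its FreqBEVFlow block compensates for asynchronous transmission latency, and its FreqBEVFusion block fuses heterogeneous vehicle and infrastructure features.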
Engineering notes
V2I cooperation lets an autonomous vehicle improve its perception with information from roadside infrastructure. On the real-world DAIR-V2X dataset, the paper reports 61.59% mAP@3D (IoU=0.5) under ideal V2I communication, surpassing no-fusion and prior state-of-the-art methods by 16.77% and 5.78%, respectively. At 200 ms latency it still reports 61.20% mAP@3D (IoU=0.5), so the reported accuracy changes little across the latency conditions tested.
Original abstract
Accurate vehicle-to-infrastructure (V2I) cooperation can significantly enhance the perception performance of autonomous vehicles by leveraging information from infrastructure. However, existing cooperation methods based on spatial bird’s-eye-view (BEV) representation struggle with asynchronous temporal misalignment and heterogeneous feature collaboration, leading to 3D detection performance degradation and compromised safety. In this paper, we explore a frequency domain BEV representation to address these challenges and propose the FreqBEV-V2I framework that incorporates FreqBEVFlow and FreqBEVFusion blocks. In FreqBEVFlow, we design global filter spatial differential matching and wavelet-enhanced Fourier channel refinement networks to capture global motion variations via self-supervised learning, which effectively addresses transmission asynchronous latency. Meanwhile, FreqBEVFusion integrates features from vehicle and infrastructure with a frequency adaptive convolution network for V2I heterogeneous feature collaboration. Experimental results on the real-world DAIR-V2X dataset demonstrate that FreqBEV-V2I significantly outperforms current state-of-the-art methods, achieving superior 3D object detection performance and robustness across various latency conditions. Specifically, under ideal V2I communication conditions, FreqBEV-V2I achieves 61.59% mAP@3D (IoU=0.5), surpassing individual no-fusion and existing state-of-the-art methods by 16.77% and 5.78%, respectively. Even in latency-aware scenarios, FreqBEV-V2I maintains high accuracy with an mAP@3D (IoU=0.5) of 61.20% at 200 ms latency, significantly outperforming other methods. The code is available at https://github.com/DeepPhysicVision/FreqBEV-V2I.
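The abstract mentions a "global filter" applied to BEV features in the frequency domain. The sketch below illustrates only the generic idea behind such a filter: transform a BEV feature map with a 2D FFT, multiply it elementwise by a per-channel frequency weight, and transform back. It is not the authors' implementation. The function name `global_filter`, the tensor shapes, and the hand-built weights are assumptions made for illustration; in a trained network the weight would be a learned parameter.

```python
import numpy as np


def global_filter(bev, weight):
    """Filter a BEV feature map in the frequency domain (illustrative sketch).

    bev:    real array of shape (C, H, W), a hypothetical BEV feature map
    weight: complex array of shape (C, H, W // 2 + 1), matching rfft2 output
    """
    # Real-input 2D FFT over the two spatial axes.
    spec = np.fft.rfft2(bev, axes=(-2, -1))
    # Elementwise product in frequency space equals a circular convolution
    # over the whole map in the spatial domain, hence "global".
    spec = spec * weight
    # Back to the spatial domain at the original resolution.
    return np.fft.irfft2(spec, s=bev.shape[-2:], axes=(-2, -1))


rng = np.random.default_rng(0)
C, H, W = 4, 16, 16
bev = rng.standard_normal((C, H, W))

# An all-ones weight leaves the features unchanged.
identity = np.ones((C, H, W // 2 + 1), dtype=complex)
out = global_filter(bev, identity)

# Keeping only the zero-frequency bin replaces each channel by its mean.
dc_only = np.zeros((C, H, W // 2 + 1), dtype=complex)
dc_only[:, 0, 0] = 1.0
smoothed = global_filter(bev, dc_only)
```

Because each frequency bin depends on every spatial location, a single multiplication gives every output cell a receptive field that covers the whole BEV grid. This is one plausible reason a frequency-domain representation suits capturing global motion, though the paper itself should be consulted for the actual FreqBEVFlow design.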