Autonomous driving paper index

Height3D: A Roadside Visual Framework Based on Height Prediction in Real 3-D Space

2025-07-01 · IEEE transactions on intelligent transportation systems (Print)

autonomous driving · autonomous vehicle · BEV · 3D object detection · object detection · perception · prediction

One-line summary

Height3D is a vision-based roadside 3D object detection framework for Intelligent Transportation Systems (ITS) that predicts target height in real 3D space rather than in 2D image space.

Engineering notes

Height3D is evaluated on two large-scale roadside benchmarks, DAIR-V2X-I and Rope3D. It outperforms state-of-the-art methods by 1.15, 7.37, and 4.03 Average Precision (AP) on the Vehicle, Pedestrian, and Cyclist categories, respectively, in the 3D object detection task. The method reaches 31.55 FPS without any CUDA or TensorRT acceleration.

Chinese explanation / 中文解读

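Chinese explanation pending: this site prioritizes adding Chinese explanations for high-value papers on end-to-end autonomous driving, BEV perception, 3D object detection, trajectory prediction, path planning, and LiDAR perception.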

Original abstract

In recent years, vision-based roadside 3D object detection has received a great deal of attention, which is an important part of the Intelligent Transportation System (ITS). It extends the perception range beyond the limitations of Autonomous Vehicle (AV) and enhances road safety. While previous work mainly focuses on height prediction in image 2D space, which is limited by the perspective property of near-large and far-small on images, making it difficult for network to understand real dimension of targets in the 3D world. Inspired by this insight, a roadside visual framework Height3D based on height prediction in real 3D space, is proposed. Height Prediction Block (HPB) with explicit height supervision is proposed in real 3D space instead of in image 2D space to predict the height distribution of targets for roadside view transform. Also, Spatial Aware Block (SAB) is used to further extracts spatial context information in BEV space and enhances fine-grained BEV features. The proposed method is applied to two large-scale roadside benchmarks, DAIR-V2X-I and Rope3D. Extensive experiments are performed to verify its effectiveness. The proposed Height3D outperforms the state-of-the-art methods of (1.15, 7.37, 4.03) Average Precision (AP) for Vehicle, Pedestrian and Cyclist categories in 3D object detection task, respectively. Meanwhile, the proposed method achieves 31.55 FPS without using any CUDA or TensorRT acceleration. The code is available at https://github.com/zhangzhang2024/Height3D
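The geometric intuition behind the abstract can be sketched in a few lines. This is a minimal illustration under a plain pinhole camera model, not the Height3D implementation: the paper's Height Prediction Block is a learned module, and the function names, the camera parameters, and the top-down camera pose below are all hypothetical. The first function shows why height measured in image space is ambiguous (apparent size shrinks with distance); the second shows how a height expressed in real 3D space pins a pixel to a single 3D point, by intersecting the pixel's viewing ray with the horizontal plane at that height.

```python
import numpy as np


def pixel_height(focal_px, object_height_m, depth_m):
    """Pinhole projection: apparent height in pixels falls off as 1 / depth."""
    return focal_px * object_height_m / depth_m


def lift_pixel_to_height(u, v, K, R, t, height_m):
    """Intersect the viewing ray of pixel (u, v) with the world plane z = height_m.

    K is the 3x3 intrinsic matrix; R and t map world to camera coordinates
    (x_cam = R @ x_world + t). The world z axis is assumed to point up.
    """
    ray_cam = np.linalg.inv(K) @ np.array([u, v, 1.0])  # ray direction, camera frame
    ray_world = R.T @ ray_cam                            # same direction, world frame
    cam_center = -R.T @ t                                # camera position in the world
    if abs(ray_world[2]) < 1e-9:
        raise ValueError("viewing ray is parallel to the height plane")
    scale = (height_m - cam_center[2]) / ray_world[2]
    if scale <= 0:
        raise ValueError("height plane lies behind the camera along this ray")
    return cam_center + scale * ray_world


# Hypothetical camera: 6 m above the ground, looking straight down.
K = np.array([[1000.0, 0.0, 500.0],
              [0.0, 1000.0, 500.0],
              [0.0, 0.0, 1.0]])
R = np.diag([1.0, -1.0, -1.0])   # camera z axis points along world -z
t = np.array([0.0, 0.0, 6.0])    # equals -R @ camera_position

# The same 1.5 m object covers five times fewer pixels at 50 m than at 10 m.
print(pixel_height(1000.0, 1.5, 10.0), pixel_height(1000.0, 1.5, 50.0))

# One pixel maps to different 3D points depending on the assumed real-world height.
print(lift_pixel_to_height(600, 500, K, R, t, 0.0))
print(lift_pixel_to_height(600, 500, K, R, t, 1.5))
```

In this sketch the height is passed in directly; in a learned pipeline it would come from a predicted height distribution, which is the part the abstract attributes to the Height Prediction Block.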

Engineering value: 6.5

Research novelty: 8.0

Business relevance: 5.0

Links and sources

Official / arXiv page

