Autonomous driving paper index

SE-PointFormer: An Efficient 3D Object Detection Network Based on Image Semantics and Enhanced Point Clouds

2025-07-28 · Cybersecurity and Cyberforensics Conference

autonomous driving, 3d object detection, instance segmentation, object detection, lidar, point cloud, nuscenes, prediction

One-line summary

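SE-PointFormer attaches 2D instance-segmentation semantics to LiDAR points, adds Gaussian-sampled virtual points to counter point cloud sparsity, and refines Transformer-decoder predictions with image semantic features, reaching 68.6% mAP and 72.3% NDS on the nuScenes test set.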

Engineering notes

Key topics: autonomous driving, 3d object detection, instance segmentation, object detection, lidar, point cloud, nuscenes, prediction. See the paper for implementation details and experimental results.

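Chinese explanation

Chinese explanation pending: this site prioritizes adding Chinese-language explanations for high-value papers on end-to-end autonomous driving, BEV perception, 3D object detection, trajectory prediction, path planning, and LiDAR perception.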

Original abstract

As autonomous driving technology develops, 3D object detection in complex dynamic environments has become increasingly important. However, image-based 3D object detection methods struggle to detect objects accurately because they lack depth information. Although LiDAR provides more accurate 3D data, its high cost and data sparsity limit where it can be applied. This article therefore proposes a 3D object detection network based on image semantics and enhanced point clouds. It aims to address point cloud sparsity by using the rich semantic information in images to enhance the expressive power of point clouds, thereby improving 3D detection accuracy. The method first uses a 2D instance segmentation model to segment images from multiple views, and then, in the semantic point cloud construction module, assigns the extracted image semantic information to the corresponding points. At the same time, a virtual point cloud carrying semantic information is constructed through Gaussian sampling. The enhanced point cloud then undergoes feature encoding and feature extraction, and a Transformer-based decoder makes a preliminary prediction of the 3D targets. Finally, a feature space sampling module efficiently fuses the image semantic feature maps with the enhanced point cloud features to further refine the detection results. The model was validated on the nuScenes dataset, where the proposed method outperformed existing comparable 3D object detection algorithms on multiple metrics. On the test set, mAP and NDS reached 68.6% and 72.3%, respectively, with the advantage most evident for objects with sparse point clouds. Ablation experiments verified that the semantic point cloud construction module and the feature space sampling module both improve detection accuracy.
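The first two stages in the abstract (assigning image semantics to points, then adding virtual points by Gaussian sampling) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. The function names `paint_points` and `sample_virtual_points`, the pinhole camera model, the convention that label 0 means background, and the choice to jitter existing instance points with isotropic Gaussian noise are all assumptions made for illustration. The abstract does not specify how the Gaussian sampling is actually performed.

```python
import numpy as np


def paint_points(points_xyz, seg_mask, cam_from_lidar, intrinsics):
    """Give each LiDAR point the instance id of the pixel it projects to.

    points_xyz:     (N, 3) points in the LiDAR frame.
    seg_mask:       (H, W) integer instance mask, 0 = background (assumed).
    cam_from_lidar: (4, 4) rigid transform from LiDAR to camera frame.
    intrinsics:     (3, 3) pinhole camera matrix (assumed camera model).
    Returns (N,) labels; points behind the camera or off-image get 0.
    """
    n = points_xyz.shape[0]
    homo = np.hstack([points_xyz, np.ones((n, 1))])
    cam = (cam_from_lidar @ homo.T).T[:, :3]

    labels = np.zeros(n, dtype=np.int64)
    in_front = cam[:, 2] > 1e-6  # keep only points with positive depth

    # Perspective projection to pixel coordinates.
    uvw = (intrinsics @ cam[in_front].T).T
    u = np.floor(uvw[:, 0] / uvw[:, 2]).astype(int)
    v = np.floor(uvw[:, 1] / uvw[:, 2]).astype(int)

    h, w = seg_mask.shape
    inside = (u >= 0) & (u < w) & (v >= 0) & (v < h)
    idx = np.flatnonzero(in_front)[inside]
    labels[idx] = seg_mask[v[inside], u[inside]]
    return labels


def sample_virtual_points(points_xyz, labels, per_instance=32, sigma=0.1, seed=0):
    """Create labelled virtual points around each instance's real points.

    Hypothetical scheme: pick real points of an instance at random and add
    isotropic Gaussian noise with standard deviation `sigma` (metres).
    """
    rng = np.random.default_rng(seed)
    virt_xyz, virt_lab = [], []
    for inst in np.unique(labels):
        if inst == 0:  # skip background
            continue
        src = points_xyz[labels == inst]
        picks = src[rng.integers(0, len(src), size=per_instance)]
        virt_xyz.append(picks + rng.normal(0.0, sigma, size=picks.shape))
        virt_lab.append(np.full(per_instance, inst, dtype=np.int64))
    if not virt_xyz:
        return np.empty((0, 3)), np.empty((0,), dtype=np.int64)
    return np.vstack(virt_xyz), np.concatenate(virt_lab)


if __name__ == "__main__":
    # Toy scene: identity extrinsics, 100x100 image, one instance in the centre.
    pts = np.array([[0.0, 0.0, 10.0], [0.0, 0.0, -5.0], [10.0, 0.0, 10.0]])
    mask = np.zeros((100, 100), dtype=np.int64)
    mask[40:60, 40:60] = 7
    K = np.array([[100.0, 0.0, 50.0], [0.0, 100.0, 50.0], [0.0, 0.0, 1.0]])

    labels = paint_points(pts, mask, np.eye(4), K)
    virt_xyz, virt_lab = sample_virtual_points(pts, labels, per_instance=5)
    enhanced = np.vstack([pts, virt_xyz])  # real + virtual points
    print(labels, enhanced.shape)
```

In a multi-camera setup such as nuScenes, a sketch like this would be run once per camera and the labels merged. The resulting enhanced cloud (real plus virtual points, each with a semantic label) is what the abstract describes as the input to feature encoding.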

Engineering value: 5.5

Research novelty: 7.0

Business relevance: 5.0

Links and sources

Official / arXiv page

