Autonomous driving paper index

TBBOcc: A Lightweight Twin‐Branch Binarized Network for Efficient 3D Semantic Occupancy Prediction in Autonomous Driving

2025-01-01 · IET Intelligent Transport Systems

autonomous driving system · autonomous driving · BEV · occupancy prediction · occupancy · depth estimation · nuScenes · deployment · prediction

One-line summary

TBBOcc is a lightweight two-branch binarized network for 3D semantic occupancy prediction that targets the efficiency-accuracy trade-off by co-optimizing binarized feature extraction, an improved EfficientViM module, and dynamic temporal fusion.

Engineering notes

On the Occ3D-nuScenes validation set, TBBOcc reaches 39.1% mean intersection over union (mIoU) with 32.8 M parameters and 164.8 G FLOPs. Compared with the baseline model FlashOcc, the authors report 26.6% fewer parameters, 33.7% less computation, and a 3.3% improvement in accuracy.
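The percentages above can be turned back into approximate baseline figures. The sketch below assumes both reductions are relative to the FlashOcc baseline (reduced = baseline × (1 − r)); the helper name `implied_baseline` is hypothetical, and the resulting numbers are derived estimates, not values quoted from the paper.

```python
def implied_baseline(reduced_value, reduction_pct):
    """Back out a baseline from a reduced value and a relative reduction in percent.

    Assumes reduced_value = baseline * (1 - reduction_pct / 100).
    """
    return reduced_value / (1.0 - reduction_pct / 100.0)


# Figures reported for TBBOcc in the notes above.
tbbocc_params_m = 32.8    # parameters, in millions
tbbocc_flops_g = 164.8    # FLOPs, in billions

# Implied baseline figures under the relative-reduction assumption.
baseline_params_m = implied_baseline(tbbocc_params_m, 26.6)
baseline_flops_g = implied_baseline(tbbocc_flops_g, 33.7)

print(f"implied baseline parameters: {baseline_params_m:.1f} M")
print(f"implied baseline FLOPs: {baseline_flops_g:.1f} G")
```

Under that assumption the baseline works out to roughly 44.7 M parameters and 248.6 G FLOPs. The 3.3% accuracy figure is left out because the text does not say whether it is an absolute or a relative mIoU gain.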

Chinese explanation / 中文解读

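Chinese explanation pending: this site prioritizes adding Chinese-language notes for high-value papers on end-to-end autonomous driving, BEV perception, 3D object detection, trajectory prediction, path planning, and LiDAR perception.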

Original abstract

Safety decisions in autonomous driving systems rely on an accurate understanding of 3D scenes, but existing 3D occupancy prediction (OCC) models struggle to meet the requirements of in-vehicle deployment because of their high computational complexity and large parameter counts. Traditional methods (e.g., OccWorld, FlashOcc) rely on full-precision floating-point operations and dense 3D convolution, resulting in hundreds of millions of model parameters. In this paper, we propose a lightweight two-branch binarized network, TBBOcc, that eases the efficiency-accuracy trade-off by co-optimizing several techniques. First, we design a two-branch binarized feature extractor that uses channel compression and a hyperbolic tangent relaxation activation function to alleviate the vanishing-gradient problem of binarization, reducing computation while retaining key geometric information. Second, we improve the EfficientViM module by integrating state space modeling and a two-dimensional normalization strategy, which strengthens global temporal feature modeling. Third, we introduce a dynamic temporal fusion mechanism that combines binocular depth estimation with deformable BEV pooling to capture spatio-temporal evolution. Experiments show that TBBOcc achieves 39.1% mean intersection over union (mIoU) on the Occ3D-nuScenes validation set with 32.8 M parameters and 164.8 G FLOPs, which reduces the parameter count by 26.6% and computation by 33.7%, and improves accuracy by 3.3%, compared with the baseline model FlashOcc. It performs especially well on dynamic obstacles (e.g., pedestrians, traffic cones) and in complex scenes. This work introduces binarized computation into the 3D OCC task for the first time, providing an efficient and reliable technical path toward real-time environment perception for autonomous driving.
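To make the binarization step in the abstract concrete, here is a minimal sketch of one common way a tanh relaxation is used. It is an illustration under stated assumptions, not the TBBOcc implementation: the forward pass applies a hard sign, and the backward pass substitutes the derivative of tanh(k·x) for the derivative of sign(x), which is zero almost everywhere. The function names and the sharpness parameter `k` are hypothetical; the abstract does not specify the exact formulation.

```python
import math


def binarize_forward(x):
    """Hard binarization: map each value to +1.0 or -1.0 (zero maps to +1.0)."""
    return [1.0 if v >= 0 else -1.0 for v in x]


def tanh_surrogate_grad(x, upstream, k=2.0):
    """Backward pass with a tanh relaxation (assumed form, not from the paper).

    Uses d/dx tanh(k * x) = k * (1 - tanh(k * x) ** 2) as a stand-in for the
    gradient of sign(x), so inputs near zero still receive a useful gradient.
    """
    return [g * k * (1.0 - math.tanh(k * v) ** 2) for v, g in zip(x, upstream)]


# Toy activations and an all-ones upstream gradient.
x = [-1.5, -0.1, 0.0, 0.3, 2.0]
y = binarize_forward(x)
grad = tanh_surrogate_grad(x, [1.0] * len(x))

print("binarized:", y)
print("surrogate gradient:", [round(g, 4) for g in grad])
```

The surrogate gradient peaks at x = 0 and decays smoothly for large |x|, so it never drops to exactly zero inside a fixed window the way a hard-clipped straight-through estimator does. Larger `k` tracks the sign function more closely at the cost of a narrower gradient.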

Engineering value: 6.0

Research novelty: 7.0

Business relevance: 6.0

Links and sources

Official / arXiv page
