Autonomous driving paper index
OM-YOLO: A Robust and Efficient Detection Model for Preserving Features on Small Traffic Signs
One-line summary
OM-YOLO is a YOLOv8s-based detector for small traffic signs that reaches 90.0% mAP@0.5 on TT100K with 4.6 million parameters at 120 FPS.
Engineering notes
Problem: Small-scale traffic sign detection in intelligent transportation systems is often hindered by loss of spatial resolution and by cluttered visual backgrounds. OM-YOLO addresses both by modifying the YOLOv8s baseline.
Detection heads: The baseline's large-object detection layer is removed and a dedicated small-object detection head is added, which improves detection precision and reduces the overall parameter count.
Backbone: Omni-Dimensional Dynamic Convolution (ODConv) is integrated to adaptively extract local features from signs that occupy only a few pixels.
Neck: A modified Receptive Field Block (mRFB) uses compact dilation rates to enlarge the receptive field while avoiding the gridding effect.
Loss: The Wise Intersection over Union version 3 (WIoUv3) loss, which uses a dynamic focusing mechanism, stabilizes bounding box regression on low-quality samples.
Results: On the TT100K dataset the model reaches an mAP@0.5 of 90.0%, an improvement of 2.9% over YOLOv8s, while reducing the parameter count by 58.5% to 4.6 million and running at 120 FPS.
Generalization: Cross-dataset evaluations on GTSDB and CCTSDB support the model's generalization, and the authors position OM-YOLO as a well-balanced architecture for edge deployment in autonomous vehicles.
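The gridding effect mentioned in the notes arises because a dilated kernel samples only a few positions inside a wide window, so larger dilation rates leave larger unsampled gaps. The sketch below computes that sampling pattern in one dimension for a 3-tap kernel. It is not the paper's mRFB: the abstract does not state the dilation rates, so both rate sets (`WIDE_RATES`, `COMPACT_RATES`) and the helper `dilated_footprint` are hypothetical and chosen only for illustration.

```python
def dilated_footprint(kernel_size: int, dilation: int) -> dict:
    """Describe the 1-D sampling pattern of one dilated convolution.

    Assumes an odd kernel size so the taps are symmetric around the centre.
    """
    half = (kernel_size - 1) // 2
    # Tap offsets relative to the kernel centre, e.g. [-3, 0, 3] for k=3, d=3.
    offsets = [dilation * i for i in range(-half, half + 1)]
    # Effective extent from first to last tap: k + (k - 1) * (d - 1).
    extent = kernel_size + (kernel_size - 1) * (dilation - 1)
    # Fraction of positions inside that extent that are actually sampled.
    density = kernel_size / extent
    return {"offsets": offsets, "extent": extent, "density": density}


# Hypothetical rate sets (assumptions, not taken from the paper).
WIDE_RATES = (1, 3, 5)
COMPACT_RATES = (1, 2, 3)

if __name__ == "__main__":
    for name, rates in (("wide", WIDE_RATES), ("compact", COMPACT_RATES)):
        for d in rates:
            info = dilated_footprint(3, d)
            print(
                f"{name:8s} d={d}: extent={info['extent']:2d} "
                f"sampled={info['density']:.0%} offsets={info['offsets']}"
            )
```

Under these assumptions, a 3-tap kernel with dilation 5 spans 11 positions but samples only 3 of them, while dilation 2 spans 5 positions and samples 3. Smaller rates trade some receptive field for denser coverage, which matters when a sign occupies only a handful of pixels.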
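Chinese explanation
Chinese explanation pending: this site prioritizes adding Chinese explanations for high-value papers on end-to-end autonomous driving, BEV perception, 3D object detection, trajectory prediction, path planning, LiDAR perception, and related topics.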
Original abstract
Small scale traffic sign detection in intelligent transportation systems is frequently hindered by spatial resolution degradation and complex visual background interference. To address these, this study proposes OM-YOLO, an optimized model based on the YOLOv8s baseline. To enhance detection precision and reduce the overall parameter count, the baseline large object detection layer is eliminated and a dedicated small object detection head is incorporated. Specifically, Omni-Dimensional Dynamic Convolution (ODConv) is integrated into the backbone to adaptively extract rich local features from minimal pixels. Within the neck architecture, a modified Receptive Field Block (mRFB) utilizes compact dilation rates to expand the receptive area while actively preventing the destructive gridding effect. Additionally, to ensure stable bounding box regression when processing low-quality samples, the architecture integrates the Wise Intersection over Union version 3 (WIoUv3) loss function, which leverages a dynamic focusing mechanism. Empirical evaluations conducted on the TT100K dataset demonstrate that the proposed model achieves an mAP@0.5 of 90.0%, an improvement of 2.9% over the YOLOv8s model. Crucially, OM-YOLO achieves this accuracy while reducing the total parameter count by 58.5% to 4.6 million and maintaining a robust real-time inference speed of 120 FPS. Finally, cross dataset evaluations on the GTSDB and CCTSDB collections confirm its reliable spatial generalization capability, establishing OM-YOLO as a highly viable and well-balanced architecture for edge device implementation in autonomous vehicles.
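The headline figures in the abstract can be cross-checked with simple arithmetic. The sketch below derives the baseline parameter count implied by "58.5% to 4.6 million", the baseline mAP@0.5 under the assumption that the 2.9% gain is in percentage points (the abstract does not say whether it is absolute or relative), and the per-frame time budget at 120 FPS. The function names are hypothetical helpers, not part of any library.

```python
def implied_baseline_params(reduced_millions: float, reduction: float) -> float:
    """Baseline size such that removing `reduction` of it leaves `reduced_millions`."""
    return reduced_millions / (1.0 - reduction)


def baseline_map_if_points(map_now: float, gain_points: float) -> float:
    """Baseline mAP, ASSUMING the reported gain is in absolute percentage points."""
    return map_now - gain_points


def frame_budget_ms(fps: float) -> float:
    """Average time available per frame, in milliseconds."""
    return 1000.0 / fps


if __name__ == "__main__":
    # Figures quoted in the abstract.
    base_params = implied_baseline_params(4.6, 0.585)
    base_map = baseline_map_if_points(90.0, 2.9)
    budget = frame_budget_ms(120)
    print(f"implied YOLOv8s parameters: {base_params:.2f} M")
    print(f"implied YOLOv8s mAP@0.5 (if gain is absolute): {base_map:.1f}%")
    print(f"per-frame budget at 120 FPS: {budget:.2f} ms")
```

The implied baseline of about 11.08 million parameters is close to the roughly 11 million usually quoted for YOLOv8s, so the two parameter figures are mutually consistent. The FPS figure depends on hardware and input resolution, neither of which the abstract states.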