Autonomous driving paper index
Monocular Depth Estimation on an Edge Device Using MiDaS DPT Hybrid
One-line summary
The paper presents the design and implementation of an affordable, lightweight depth estimation system that runs on a Raspberry Pi 5, built on a MiDaS DPT-Hybrid model converted from PyTorch to TensorFlow Lite.
Engineering notes
Key topics: autonomous driving, depth estimation, monocular depth, object detection. See the paper for implementation details and experimental results.
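Chinese explanation
Chinese explanation pending: this site prioritizes adding Chinese-language commentary for high-value papers on end-to-end autonomous driving, BEV perception, 3D object detection, trajectory prediction, path planning, LiDAR perception, and related topics.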
Original abstract
High-precision, real-time monocular depth estimation has become an enabling technology for robotics, autonomous navigation, and industrial safety applications. Depth estimation with deep neural networks is computationally intensive and typically requires GPU-class hardware, so deploying these models on resource-constrained edge devices is highly challenging. In this paper, we present the design and implementation of an affordable, lightweight depth estimation system running on a Raspberry Pi 5 platform. The system is built on the MiDaS DPT-Hybrid model, which we optimized by converting it from PyTorch to TensorFlow Lite (TFLite) to reduce model size, memory footprint, and inference latency. The optimized model processes real-time video streams from the Raspberry Pi camera interface and generates dense depth maps, which are post-processed to yield approximate distance values in metric units. Threshold-based decision logic triggers alerts (e.g., buzzer activation) when an object or a person is detected in a critical safety zone, demonstrating the system's suitability for collision avoidance and human-machine interaction monitoring. Experiments show a significant increase in frames per second (FPS) after TFLite conversion, confirming the effectiveness of the optimization pipeline. The hardware platform is modular, with support for the Raspberry Pi 5, a camera interface, and Coral Edge TPU acceleration. The current implementation supports CPU-only execution; future releases will add a Coral USB TPU to raise throughput and further reduce inference latency. An IR-cut switchable camera module is also being considered to improve reliability under varying light levels, extending the system to night-time and low-light operation.
This paper provides evidence that sophisticated depth estimation models can be optimized, compressed, and executed in an efficient manner on embedded hardware, offering a scalable solution to real-world low-cost applications where real-time inference, portability, and low power are of the utmost importance.
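The abstract does not say how depth maps are turned into distances or how the safety-zone threshold is applied, so the following is only a minimal sketch under stated assumptions. It assumes the model output is relative inverse depth (larger values mean closer, as MiDaS models produce), that a two-point linear calibration of inverse depth against measured distances is acceptable, and that the safety zone is a fixed rectangle in the image. The function names, the region format, and the 1.0 m threshold are all hypothetical, and plain Python lists stand in for the arrays a real pipeline would use.

```python
from statistics import median


def fit_inverse_depth_calibration(pred_a, dist_a_m, pred_b, dist_b_m):
    """Solve 1/distance = scale * prediction + shift from two reference points.

    Assumption: each reference pairs a raw model value with a distance
    measured by hand (for example with a tape measure).
    """
    if pred_a == pred_b:
        raise ValueError("reference predictions must differ")
    scale = (1.0 / dist_a_m - 1.0 / dist_b_m) / (pred_a - pred_b)
    shift = 1.0 / dist_a_m - scale * pred_a
    return scale, shift


def to_metres(pred, scale, shift, eps=1e-6):
    """Convert one relative inverse-depth value to an approximate distance."""
    inverse_depth = scale * pred + shift
    # Clamp so that near-zero or negative values map to "very far"
    # instead of dividing by zero.
    return 1.0 / max(inverse_depth, eps)


def zone_distance_m(depth_map, roi, scale, shift):
    """Median approximate distance inside roi = (row0, row1, col0, col1)."""
    row0, row1, col0, col1 = roi
    values = [
        to_metres(pred, scale, shift)
        for row in depth_map[row0:row1]
        for pred in row[col0:col1]
    ]
    if not values:
        raise ValueError("region of interest is empty")
    # The median is less sensitive to a few outlier pixels than the minimum.
    return median(values)


def should_alert(distance_m, threshold_m=1.0):
    """Threshold decision: True when something is closer than threshold_m."""
    return distance_m < threshold_m


if __name__ == "__main__":
    # Hypothetical calibration: raw value 800 seen at 0.5 m, 200 seen at 2.0 m.
    scale, shift = fit_inverse_depth_calibration(800.0, 0.5, 200.0, 2.0)
    # Tiny made-up "depth map"; the right-hand columns hold a close object.
    frame = [[200.0, 210.0, 790.0], [205.0, 800.0, 810.0]]
    distance = zone_distance_m(frame, (0, 2, 1, 3), scale, shift)
    print(f"zone distance: {distance:.2f} m, alert: {should_alert(distance)}")
```

On the device, the boolean returned by `should_alert` would drive whatever alert output is wired up, such as the buzzer mentioned in the abstract. Using the median over the zone is a design choice of this sketch, not something the paper specifies; a stricter safety policy might use a low percentile instead.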