On-Device Self-Supervised Learning of Low-Latency Monocular Depth from Only Events

Abstract

Event cameras provide low-latency perception for only milliwatts of power. This makes them highly suitable for resource-restricted, agile robots such as small flying drones. Self-supervised learning based on contrast maximization holds great potential for event-based robot vision, as it forgoes the need for high-frequency ground truth and allows for online learning in the robot's operational environment. However, online, on-board learning raises the major challenge of achieving sufficient computational efficiency for real-time learning, while maintaining competitive visual perception performance. In this work, we improve the time and memory efficiency of the contrast maximization pipeline, making on-device learning of low-latency monocular depth possible. We demonstrate that online learning on board a small drone yields more accurate depth estimates and more successful obstacle avoidance behavior compared to only pre-training. Benchmarking experiments show that the proposed pipeline is not only efficient, but also achieves state-of-the-art depth estimation performance among self-supervised approaches. Our work taps into the unused potential of online, on-device robot learning, promising smaller reality gaps and better performance.

Cite

Text

Hagenaars et al. "On-Device Self-Supervised Learning of Low-Latency Monocular Depth from Only Events." Conference on Computer Vision and Pattern Recognition, 2025. doi:10.1109/CVPR52734.2025.01595

Markdown

[Hagenaars et al. "On-Device Self-Supervised Learning of Low-Latency Monocular Depth from Only Events." Conference on Computer Vision and Pattern Recognition, 2025.](https://mlanthology.org/cvpr/2025/hagenaars2025cvpr-ondevice/) doi:10.1109/CVPR52734.2025.01595

BibTeX

@inproceedings{hagenaars2025cvpr-ondevice,
  title     = {{On-Device Self-Supervised Learning of Low-Latency Monocular Depth from Only Events}},
  author    = {Hagenaars, Jesse J. and Wu, Yilun and Paredes-Valles, Federico and Stroobants, Stein and de Croon, Guido C.H.E.},
  booktitle = {Conference on Computer Vision and Pattern Recognition},
  year      = {2025},
  pages     = {17114--17123},
  doi       = {10.1109/CVPR52734.2025.01595},
  url       = {https://mlanthology.org/cvpr/2025/hagenaars2025cvpr-ondevice/}
}