Learning Monocular Visual Odometry via Self-Supervised Long-Term Modeling

Yuliang Zou, Pan Ji, Quoc-Huy Tran, Jia-Bin Huang, Manmohan Chandraker

ECCV 2020

doi:10.1007/978-3-030-58568-6_42

Abstract

Monocular visual odometry (VO) suffers severely from error accumulation during frame-to-frame pose estimation. In this paper, we present a self-supervised learning method for VO with special consideration for consistency over longer sequences. To this end, we model the long-term dependency in pose prediction using a pose network that features a two-layer convolutional LSTM module. We train the networks with purely self-supervised losses, including a cycle consistency loss that mimics the loop closure module in geometric VO. Inspired by prior geometric systems, we allow the networks to see beyond a small temporal window during training, through a novel loss that incorporates temporally distant (e.g., $O(100)$) frames. Given GPU memory constraints, we propose a stage-wise training mechanism, where the first stage operates in a local time window and the second stage refines the poses with a "global" loss given the first stage features. We demonstrate competitive results on several standard VO datasets, including KITTI and TUM RGB-D.
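
To make the cycle consistency idea concrete, the following is a minimal sketch, not the authors' implementation. It assumes planar poses `(x, y, theta)` instead of full 6-DoF camera poses, and the helper names `compose`, `invert`, and `cycle_residual` as well as the L1-style residual are hypothetical choices made for illustration; the abstract does not specify the exact form of the loss. The sketch only shows the underlying property: a pose predicted forward in time, composed with the pose predicted backward in time, should return to the identity.

```python
import math

def compose(a, b):
    """Compose planar rigid motions: apply a, then b expressed in a's frame."""
    ax, ay, at = a
    bx, by, bt = b
    c, s = math.cos(at), math.sin(at)
    return (ax + c * bx - s * by, ay + s * bx + c * by, at + bt)

def invert(a):
    """Inverse of a planar rigid motion (R, t): (R^T, -R^T t)."""
    ax, ay, at = a
    c, s = math.cos(at), math.sin(at)
    return (-(c * ax + s * ay), s * ax - c * ay, -at)

def cycle_residual(forward, backward):
    """Hypothetical L1-style residual: distance of forward∘backward from identity."""
    x, y, t = compose(forward, backward)
    t = math.atan2(math.sin(t), math.cos(t))  # wrap angle to (-pi, pi]
    return abs(x) + abs(y) + abs(t)

# A forward pose and a perfectly consistent backward pose give zero residual.
forward = (1.0, 0.5, 0.3)
consistent = cycle_residual(forward, invert(forward))

# A slightly wrong backward prediction leaves a non-zero residual to penalize.
noisy_backward = (-1.0, -0.2, -0.25)
inconsistent = cycle_residual(forward, noisy_backward)
```

In a training setting, a residual of this kind would be computed from the network's forward and backward pose predictions and minimized alongside the other self-supervised losses.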

Cite

Text

Zou et al. "Learning Monocular Visual Odometry via Self-Supervised Long-Term Modeling." Proceedings of the European Conference on Computer Vision (ECCV), 2020. doi:10.1007/978-3-030-58568-6_42

Markdown

[Zou et al. "Learning Monocular Visual Odometry via Self-Supervised Long-Term Modeling." Proceedings of the European Conference on Computer Vision (ECCV), 2020.](https://mlanthology.org/eccv/2020/zou2020eccv-learning/) doi:10.1007/978-3-030-58568-6_42

BibTeX

@inproceedings{zou2020eccv-learning,
  title     = {{Learning Monocular Visual Odometry via Self-Supervised Long-Term Modeling}},
  author    = {Zou, Yuliang and Ji, Pan and Tran, Quoc-Huy and Huang, Jia-Bin and Chandraker, Manmohan},
  booktitle = {Proceedings of the European Conference on Computer Vision (ECCV)},
  year      = {2020},
  doi       = {10.1007/978-3-030-58568-6_42},
  url       = {https://mlanthology.org/eccv/2020/zou2020eccv-learning/}
}