TartanVO: A Generalizable Learning-Based VO

Abstract

We present the first learning-based visual odometry (VO) model that generalizes to multiple datasets and real-world scenarios, and outperforms geometry-based methods in challenging scenes. We achieve this by leveraging the SLAM dataset TartanAir, which provides a large amount of diverse synthetic data in challenging environments. Furthermore, to make our VO model generalize across datasets, we propose an up-to-scale loss function and incorporate the camera intrinsic parameters into the model. Experiments show that a single model, TartanVO, trained only on synthetic data and without any finetuning, generalizes to real-world datasets such as KITTI and EuRoC, demonstrating significant advantages over geometry-based methods on challenging trajectories. Our code is available at https://github.com/castacks/tartanvo.
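The two generalization ingredients named in the abstract can be sketched roughly as follows: an up-to-scale loss that compares only translation *directions* (monocular VO cannot recover metric scale), and an intrinsics layer that feeds per-pixel camera geometry to the network so one model can serve cameras with different calibrations. This is a minimal illustration, not the paper's exact formulation; the function names, the epsilon guard, and the channel definitions are assumptions.

```python
import numpy as np

def up_to_scale_loss(t_pred, t_gt, eps=1e-6):
    """Sketch of an up-to-scale translation loss.

    Both vectors are normalized to unit length before taking the L2
    distance, so rescaling the ground-truth translation leaves the
    loss unchanged. The eps guard avoids division by zero for
    near-stationary frames (an illustrative choice, not the paper's).
    """
    t_pred = np.asarray(t_pred, dtype=float)
    t_gt = np.asarray(t_gt, dtype=float)
    t_pred = t_pred / max(np.linalg.norm(t_pred), eps)
    t_gt = t_gt / max(np.linalg.norm(t_gt), eps)
    return float(np.linalg.norm(t_pred - t_gt))

def intrinsics_layer(h, w, fx, fy, cx, cy):
    """Sketch of an intrinsics input layer.

    Produces two extra channels giving each pixel's position relative
    to the principal point, scaled by focal length, so the network
    sees the camera geometry instead of assuming a fixed calibration.
    Returns an array of shape (2, h, w).
    """
    u, v = np.meshgrid(np.arange(w), np.arange(h))  # pixel coordinates
    return np.stack([(u - cx) / fx, (v - cy) / fy])
```

For example, `up_to_scale_loss([1, 2, 3], [2, 4, 6])` is zero because the two translations point in the same direction; only the (unobservable) scale differs.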

Cite

Text

Wang et al. "TartanVO: A Generalizable Learning-Based VO." Conference on Robot Learning, 2020.

Markdown

[Wang et al. "TartanVO: A Generalizable Learning-Based VO." Conference on Robot Learning, 2020.](https://mlanthology.org/corl/2020/wang2020corl-tartanvo/)

BibTeX

@inproceedings{wang2020corl-tartanvo,
  title     = {{TartanVO: A Generalizable Learning-Based VO}},
  author    = {Wang, Wenshan and Hu, Yaoyu and Scherer, Sebastian},
  booktitle = {Conference on Robot Learning},
  year      = {2020},
  pages     = {1761--1772},
  volume    = {155},
  url       = {https://mlanthology.org/corl/2020/wang2020corl-tartanvo/}
}