CeMNet: Self-Supervised Learning for Accurate Continuous Ego-Motion Estimation

Abstract

In this paper, we propose a self-supervised learning approach for estimating continuous ego-motion from video. Our model learns to estimate camera motion by watching RGBD or RGB video streams and determining the translational and rotational velocities that correctly predict the appearance of future frames. Our approach differs from other recent work on self-supervised structure-from-motion in its use of a continuous motion formulation and its representation of rigid motion fields, rather than direct prediction of camera parameters. To make estimation robust in dynamic environments with multiple moving objects, we introduce a simple two-component segmentation process that isolates the rigid background environment from dynamic scene elements. We demonstrate state-of-the-art accuracy of the self-trained model on several benchmark ego-motion datasets and highlight its superior rotational accuracy and its handling of nonrigid scene motions.
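
The continuous formulation described above builds on the classical instantaneous motion-field equations (Longuet-Higgins and Prazdny), in which translational and rotational velocities together with per-pixel depth determine a dense rigid flow field. Below is a minimal sketch of that relationship, assuming a pinhole camera with focal length f and principal point (cx, cy); the function names, the NumPy/SciPy implementation, and the simple L1 photometric comparison are illustrative assumptions, not the paper's actual architecture or loss.

import numpy as np
from scipy.ndimage import map_coordinates

def rigid_motion_field(depth, f, cx, cy, v, w):
    """Instantaneous rigid motion field induced by camera velocities.

    depth : (H, W) per-pixel depth Z > 0 (e.g., from the RGBD stream).
    f     : focal length in pixels; (cx, cy) principal point.
    v     : (vx, vy, vz) translational velocity.
    w     : (wx, wy, wz) rotational (angular) velocity.
    Returns (u, v_flow): per-pixel image velocities in pixels/frame.
    """
    H, W = depth.shape
    ys, xs = np.mgrid[0:H, 0:W].astype(np.float64)
    x = xs - cx  # image coordinates relative to the principal point
    y = ys - cy
    vx, vy, vz = v
    wx, wy, wz = w
    # Translational component scales with inverse depth ...
    u_t = (vz * x - f * vx) / depth
    v_t = (vz * y - f * vy) / depth
    # ... while the rotational component is depth-independent, which is
    # one reason a continuous formulation can recover rotation accurately.
    u_r = wx * x * y / f - wy * (f + x**2 / f) + wz * y
    v_r = wx * (f + y**2 / f) - wy * x * y / f - wz * x
    return u_t + u_r, v_t + v_r

def photometric_loss(frame_t, frame_t1, flow_u, flow_v):
    """L1 penalty between frame t and frame t+1 warped back by the flow
    (an illustrative self-supervision signal, not the paper's exact loss)."""
    H, W = frame_t.shape
    ys, xs = np.mgrid[0:H, 0:W].astype(np.float64)
    warped = map_coordinates(frame_t1, [ys + flow_v, xs + flow_u],
                             order=1, mode='nearest')
    return np.abs(frame_t - warped).mean()

if __name__ == "__main__":
    # Toy example: a flat scene 5 m away, forward motion plus slight yaw.
    H, W = 64, 64
    depth = np.full((H, W), 5.0)
    u, vf = rigid_motion_field(depth, f=100.0, cx=W / 2, cy=H / 2,
                               v=(0.0, 0.0, 0.1),   # forward translation
                               w=(0.0, 0.02, 0.0))  # yaw rate
    print(u.mean(), vf.mean())

In a training loop, minimizing such a photometric loss over (v, w) yields a self-supervision signal without ground-truth poses; the two-component segmentation described in the abstract would restrict this loss to background pixels so that independently moving objects do not corrupt the ego-motion estimate.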

Cite

Text

Lee and Fowlkes. "CeMNet: Self-Supervised Learning for Accurate Continuous Ego-Motion Estimation." IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, 2019. doi:10.1109/CVPRW.2019.00048

Markdown

[Lee and Fowlkes. "CeMNet: Self-Supervised Learning for Accurate Continuous Ego-Motion Estimation." IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, 2019.](https://mlanthology.org/cvprw/2019/lee2019cvprw-cemnet/) doi:10.1109/CVPRW.2019.00048

BibTeX

@inproceedings{lee2019cvprw-cemnet,
  title     = {{CeMNet: Self-Supervised Learning for Accurate Continuous Ego-Motion Estimation}},
  author    = {Lee, Minhaeng and Fowlkes, Charless C.},
  booktitle = {IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops},
  year      = {2019},
  pages     = {354--363},
  doi       = {10.1109/CVPRW.2019.00048},
  url       = {https://mlanthology.org/cvprw/2019/lee2019cvprw-cemnet/}
}