Feature-Metric Loss for Self-Supervised Learning of Depth and Egomotion

Abstract

Photometric loss is widely used for self-supervised depth and egomotion estimation. However, the loss landscapes induced by photometric differences are often problematic for optimization: they exhibit plateaus for pixels in texture-less regions and multiple local minima for less discriminative pixels. In this work, a feature-metric loss is proposed and defined on a feature representation, where the feature representation is also learned in a self-supervised manner and regularized by both first-order and second-order derivatives to constrain the loss landscapes to form proper convergence basins. Comprehensive experiments and detailed analysis via visualization demonstrate the effectiveness of the proposed feature-metric loss. In particular, our method improves the state of the art on KITTI from 0.885 to 0.925 measured by $\delta_1$ for depth estimation, and significantly outperforms previous methods for visual odometry.
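
To make the idea concrete, below is a minimal PyTorch-style sketch of a feature-metric loss with first- and second-order regularizers on the learned features. All names here (`feature_metric_loss`, `representation_regularizers`, the exponential texture weighting, and the `alpha` balance term) are illustrative assumptions, not the paper's exact formulation; the source-view feature map is assumed to have already been warped into the target view using the predicted depth and egomotion.

```python
import torch

def gradient_xy(t):
    # First-order spatial differences along width (gx) and height (gy).
    gx = t[..., :, 1:] - t[..., :, :-1]
    gy = t[..., 1:, :] - t[..., :-1, :]
    return gx, gy

def feature_metric_loss(feat_target, feat_source_warped, mask=None):
    # L1 difference measured in feature space instead of pixel space.
    # feat_source_warped: source features inverse-warped into the target
    # view with the predicted depth and pose (e.g. via F.grid_sample).
    diff = (feat_target - feat_source_warped).abs().mean(dim=1, keepdim=True)
    if mask is not None:
        diff = diff * mask
    return diff.mean()

def representation_regularizers(feat, image, alpha=1e-3):
    # Sketch of the two regularizers described in the abstract:
    # a discriminative (first-order) term that encourages large feature
    # gradients, weighted toward texture-less image regions, and a
    # convergent (second-order) term that penalizes feature curvature so
    # the loss landscape forms smooth, wide convergence basins.
    # The weighting scheme and alpha value are assumptions.
    img_gx, img_gy = gradient_xy(image.mean(dim=1, keepdim=True))
    feat_gx, feat_gy = gradient_xy(feat)

    # Discriminative term: larger weight where image texture is weak.
    w_x = torch.exp(-img_gx.abs())
    w_y = torch.exp(-img_gy.abs())
    l_dis = -(w_x * feat_gx.abs().mean(dim=1, keepdim=True)).mean() \
            - (w_y * feat_gy.abs().mean(dim=1, keepdim=True)).mean()

    # Convergent term: penalize second-order feature gradients.
    gxx, gxy = gradient_xy(feat_gx)
    gyx, gyy = gradient_xy(feat_gy)
    l_cvt = gxx.abs().mean() + gxy.abs().mean() \
            + gyx.abs().mean() + gyy.abs().mean()

    return l_dis + alpha * l_cvt

if __name__ == "__main__":
    feat_t = torch.randn(2, 64, 48, 160)   # target-view features
    feat_s = torch.randn(2, 64, 48, 160)   # warped source-view features
    img_t = torch.rand(2, 3, 48, 160)      # target image
    loss = feature_metric_loss(feat_t, feat_s) \
           + representation_regularizers(feat_t, img_t)
    print(loss.item())
```

In this sketch, replacing the feature maps with raw RGB values would recover an ordinary photometric L1 loss; the regularizers are what shape the feature space so that the warped-feature difference has informative gradients even in texture-less regions.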

Cite

Text

Shu et al. "Feature-Metric Loss for Self-Supervised Learning of Depth and Egomotion." Proceedings of the European Conference on Computer Vision (ECCV), 2020. doi:10.1007/978-3-030-58529-7_34

Markdown

[Shu et al. "Feature-Metric Loss for Self-Supervised Learning of Depth and Egomotion." Proceedings of the European Conference on Computer Vision (ECCV), 2020.](https://mlanthology.org/eccv/2020/shu2020eccv-featuremetric/) doi:10.1007/978-3-030-58529-7_34

BibTeX

@inproceedings{shu2020eccv-featuremetric,
  title     = {{Feature-Metric Loss for Self-Supervised Learning of Depth and Egomotion}},
  author    = {Shu, Chang and Yu, Kun and Duan, Zhixiang and Yang, Kuiyuan},
  booktitle = {Proceedings of the European Conference on Computer Vision (ECCV)},
  year      = {2020},
  doi       = {10.1007/978-3-030-58529-7_34},
  url       = {https://mlanthology.org/eccv/2020/shu2020eccv-featuremetric/}
}