Self-Distilled Feature Aggregation for Self-Supervised Monocular Depth Estimation

Abstract

Self-supervised monocular depth estimation has received much attention recently in computer vision. Most existing works aggregate multi-scale features for depth prediction via either straightforward concatenation or element-wise addition. However, such feature aggregation operations generally neglect the contextual consistency between multi-scale features. To address this problem, we propose the Self-Distilled Feature Aggregation (SDFA) module, which simultaneously aggregates a pair of low-scale and high-scale features and maintains their contextual consistency. The SDFA module employs three branches to learn three feature offset maps: one offset map for refining the input low-scale feature, and the other two for refining the input high-scale feature in a designed self-distillation manner. We then propose an SDFA-based network for self-supervised monocular depth estimation and design a self-distilled training strategy to train it. Experimental results on the KITTI dataset demonstrate that the proposed method outperforms the comparative state-of-the-art methods in most cases. The code is available at https://github.com/ZM-Zhou/SDFA-Net_pytorch.
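
To make the idea of refining features with offset maps concrete, the following is a minimal pure-Python sketch, not the authors' implementation. It assumes that an offset map stores a per-location displacement (dx, dy), that refinement means bilinearly resampling the feature at the displaced location, that both features have already been brought to the same resolution, and that the refined features are combined by element-wise addition. The function names (`bilinear_sample`, `refine_with_offsets`, `aggregate`) are hypothetical. Learning the offsets, and choosing which of the two high-scale offset maps to use under self-distillation, are left out; the caller simply passes one map per feature.

```python
import math


def bilinear_sample(feature, x, y):
    """Sample a 2D feature map (list of rows) at fractional (x, y), clamped to the border."""
    h, w = len(feature), len(feature[0])
    x = min(max(x, 0.0), w - 1.0)
    y = min(max(y, 0.0), h - 1.0)
    x0, y0 = int(math.floor(x)), int(math.floor(y))
    x1, y1 = min(x0 + 1, w - 1), min(y0 + 1, h - 1)
    fx, fy = x - x0, y - y0
    top = feature[y0][x0] * (1 - fx) + feature[y0][x1] * fx
    bottom = feature[y1][x0] * (1 - fx) + feature[y1][x1] * fx
    return top * (1 - fy) + bottom * fy


def refine_with_offsets(feature, offsets):
    """Resample each location (x, y) at (x + dx, y + dy), where offsets[y][x] == (dx, dy).

    Assumption: this resampling stands in for the 'refinement' step of the abstract.
    """
    return [
        [
            bilinear_sample(feature, x + offsets[y][x][0], y + offsets[y][x][1])
            for x in range(len(feature[0]))
        ]
        for y in range(len(feature))
    ]


def aggregate(low, high, low_offsets, high_offsets):
    """Refine both features with their own offset maps, then add them element-wise."""
    refined_low = refine_with_offsets(low, low_offsets)
    refined_high = refine_with_offsets(high, high_offsets)
    return [
        [a + b for a, b in zip(row_low, row_high)]
        for row_low, row_high in zip(refined_low, refined_high)
    ]
```

With all-zero offsets, `aggregate` reduces to plain element-wise addition, which is the baseline aggregation the abstract argues against; non-zero offsets let each location draw its value from a nearby position before the two features are summed.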

Cite

Text

Zhou and Dong. "Self-Distilled Feature Aggregation for Self-Supervised Monocular Depth Estimation." Proceedings of the European Conference on Computer Vision (ECCV), 2022. doi:10.1007/978-3-031-19769-7_41

Markdown

[Zhou and Dong. "Self-Distilled Feature Aggregation for Self-Supervised Monocular Depth Estimation." Proceedings of the European Conference on Computer Vision (ECCV), 2022.](https://mlanthology.org/eccv/2022/zhou2022eccv-selfdistilled/) doi:10.1007/978-3-031-19769-7_41

BibTeX

@inproceedings{zhou2022eccv-selfdistilled,
  title     = {{Self-Distilled Feature Aggregation for Self-Supervised Monocular Depth Estimation}},
  author    = {Zhou, Zhengming and Dong, Qiulei},
  booktitle = {Proceedings of the European Conference on Computer Vision (ECCV)},
  year      = {2022},
  doi       = {10.1007/978-3-031-19769-7_41},
  url       = {https://mlanthology.org/eccv/2022/zhou2022eccv-selfdistilled/}
}