Feature Contribution in Monocular Depth Estimation

Abstract

Monocular Depth Estimation (MDE) is an inherently ill-posed problem due to the lack of binocular depth cues; despite this, significant research has been done in this field in recent years. In an attempt to bridge understanding between human and machine perception, this paper investigates the concepts learned by the general-purpose model Depth Anything, focusing on features that are known to be present in the human visual system. We perform interventions on different image features within the KITTI and NYUv2 datasets and evaluate performance on the intervened inputs. This yields insights into how, and how much, each of these features influences depth perception. These insights help bridge the understanding of how humans and machines each perform MDE, and we hope they also offer future work a new way to devise more robust methods of training neural networks for MDE.

Cite

Text

Lau et al. "Feature Contribution in Monocular Depth Estimation." European Conference on Computer Vision Workshops, 2024. doi:10.1007/978-3-031-92648-8_16

Markdown

[Lau et al. "Feature Contribution in Monocular Depth Estimation." European Conference on Computer Vision Workshops, 2024.](https://mlanthology.org/eccvw/2024/lau2024eccvw-feature/) doi:10.1007/978-3-031-92648-8_16

BibTeX

@inproceedings{lau2024eccvw-feature,
  title     = {{Feature Contribution in Monocular Depth Estimation}},
  author    = {Lau, Hui Yu and Dasmahapatra, Srinandan and Kim, Hansung},
  booktitle = {European Conference on Computer Vision Workshops},
  year      = {2024},
  pages     = {251--265},
  doi       = {10.1007/978-3-031-92648-8_16},
  url       = {https://mlanthology.org/eccvw/2024/lau2024eccvw-feature/}
}