Disambiguating Monocular Depth Estimation with a Single Transient
Abstract
Monocular depth estimation algorithms successfully predict the relative depth order of objects in a scene. However, because of the fundamental scale ambiguity associated with monocular images, these algorithms fail at correctly predicting true metric depth. In this work, we demonstrate how a depth histogram of the scene, which can be readily captured using a single-pixel time-resolved detector, can be fused with the output of existing monocular depth estimation algorithms to resolve the depth ambiguity problem. We validate this novel sensor fusion technique experimentally and in extensive simulation. We show that it significantly improves the performance of several state-of-the-art monocular depth estimation algorithms.
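As a rough illustration of the fusion idea described above, the sketch below remaps a relative depth map so that its distribution follows a measured depth histogram, while keeping the predicted depth ordering intact. Plain histogram matching is used here as an assumption for illustration; the abstract does not specify the fusion procedure. The function name `match_depth_to_histogram` and its parameters are hypothetical, and the histogram is idealized as plain per-pixel depth counts (a real transient would also be affected by reflectance, distance falloff, and noise).

```python
import numpy as np


def match_depth_to_histogram(relative_depth, hist_counts, bin_edges):
    """Remap a relative depth map so its values follow a measured depth histogram.

    relative_depth: 2D array, correct in ordering but with unknown scale.
    hist_counts:    1D array of counts per depth bin (idealized transient).
    bin_edges:      1D array of metric bin edges, len(hist_counts) + 1.
    """
    flat = np.asarray(relative_depth, dtype=np.float64).ravel()

    # Rank of every pixel in (0, 1); this keeps the predicted depth ordering.
    order = np.argsort(flat, kind="stable")
    ranks = np.empty(flat.size, dtype=np.float64)
    ranks[order] = (np.arange(flat.size) + 0.5) / flat.size

    # Cumulative distribution of the measured histogram at the bin edges.
    cdf = np.concatenate(([0.0], np.cumsum(hist_counts, dtype=np.float64)))
    cdf /= cdf[-1]

    # Invert the CDF: each rank is mapped to a metric depth value.
    metric = np.interp(ranks, cdf, bin_edges)
    return metric.reshape(np.shape(relative_depth))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    true_depth = rng.uniform(1.0, 5.0, size=(48, 64))   # metres (synthetic)
    relative = 0.1 * true_depth                         # unknown global scale
    counts, edges = np.histogram(true_depth, bins=np.linspace(0.0, 10.0, 101))
    recovered = match_depth_to_histogram(relative, counts, edges)
    print("max abs error [m]:", np.abs(recovered - true_depth).max())
```

Because the mapping is monotonic, the relative ordering predicted by the monocular estimator is preserved, and only the scale and distribution of depth values are taken from the histogram. In this idealized setting the recovered depth is accurate to within one histogram bin.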
Cite
Text
Nishimura et al. "Disambiguating Monocular Depth Estimation with a Single Transient." Proceedings of the European Conference on Computer Vision (ECCV), 2020. doi:10.1007/978-3-030-58589-1_9
Markdown
[Nishimura et al. "Disambiguating Monocular Depth Estimation with a Single Transient." Proceedings of the European Conference on Computer Vision (ECCV), 2020.](https://mlanthology.org/eccv/2020/nishimura2020eccv-disambiguating/) doi:10.1007/978-3-030-58589-1_9
BibTeX
@inproceedings{nishimura2020eccv-disambiguating,
title = {{Disambiguating Monocular Depth Estimation with a Single Transient}},
author = {Nishimura, Mark and Lindell, David B. and Metzler, Christopher and Wetzstein, Gordon},
booktitle = {Proceedings of the European Conference on Computer Vision (ECCV)},
year = {2020},
doi = {10.1007/978-3-030-58589-1_9},
url = {https://mlanthology.org/eccv/2020/nishimura2020eccv-disambiguating/}
}