Deep Cost Ray Fusion for Sparse Depth Video Completion
Abstract
In this paper, we present a learning-based framework for sparse depth video completion. Given a sparse depth map and a color image at a certain viewpoint, our approach constructs a cost volume over depth hypothesis planes. To effectively fuse the sequential cost volumes of multiple viewpoints for improved depth completion, we introduce a learning-based cost volume fusion framework, namely RayFusion, that leverages an attention mechanism over each pair of overlapped rays in adjacent cost volumes. By exploiting feature statistics accumulated over time, our proposed framework consistently outperforms or rivals state-of-the-art approaches on diverse indoor and outdoor datasets, including the KITTI Depth Completion benchmark, the VOID Depth Completion benchmark, and the ScanNetV2 dataset, while using far fewer network parameters.
Cite
Text
Kim et al. "Deep Cost Ray Fusion for Sparse Depth Video Completion." Proceedings of the European Conference on Computer Vision (ECCV), 2024. doi:10.1007/978-3-031-73347-5_19
Markdown
[Kim et al. "Deep Cost Ray Fusion for Sparse Depth Video Completion." Proceedings of the European Conference on Computer Vision (ECCV), 2024.](https://mlanthology.org/eccv/2024/kim2024eccv-deep/) doi:10.1007/978-3-031-73347-5_19
BibTeX
@inproceedings{kim2024eccv-deep,
title = {{Deep Cost Ray Fusion for Sparse Depth Video Completion}},
author = {Kim, Jungeon and Kim, Soongjin and Park, Jaesik and Lee, Seungyong},
booktitle = {Proceedings of the European Conference on Computer Vision (ECCV)},
year = {2024},
doi = {10.1007/978-3-031-73347-5_19},
url = {https://mlanthology.org/eccv/2024/kim2024eccv-deep/}
}