V-FUSE: Volumetric Depth Map Fusion with Long-Range Constraints

Abstract

We introduce a learning-based depth map fusion framework that accepts a set of depth and confidence maps generated by a Multi-View Stereo (MVS) algorithm as input and improves them. This is accomplished by integrating volumetric visibility constraints, which encode long-range surface relationships across different views, into an end-to-end trainable architecture. We also introduce a depth search window estimation sub-network, trained jointly with the larger fusion sub-network, that reduces the depth hypothesis search space along each ray. Our method learns to model depth consensus and violations of visibility constraints directly from the data, effectively removing the need to fine-tune fusion parameters. Extensive experiments on MVS datasets show substantial improvements in the accuracy of the output fused depth and confidence maps.
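To make the two ideas in the abstract concrete, below is a minimal NumPy sketch of (a) sampling depth hypotheses inside a per-pixel search window and (b) fusing them by scoring depth consensus against visibility violations. The function names, the fixed hypothesis count, and the hand-set scoring weights are illustrative assumptions for this sketch, not the authors' architecture, which predicts the window with a sub-network and learns the fusion weights end to end.

```python
import numpy as np

def build_hypotheses(center, radius, num_samples=8):
    """Sample depth hypotheses inside a per-pixel search window.

    center, radius: (H, W) arrays holding an input depth estimate and the
    half-width of its search window (V-FUSE predicts the window with a
    sub-network; here both are taken as given).
    Returns an (S, H, W) volume of candidate depths along each ray.
    """
    offsets = np.linspace(-1.0, 1.0, num_samples)               # (S,)
    return center[None] + offsets[:, None, None] * radius[None]

def fuse(hypotheses, support, occlusions, free_space_violations):
    """Score hypotheses and regress a fused depth plus a confidence.

    support: (S, H, W) count of views whose depth agrees with each
    hypothesis. occlusions / free_space_violations: (S, H, W) counts of
    the two long-range visibility violations (a hypothesis would occlude
    a surface seen by another view, or lie in space another view has
    observed to be empty). The fixed unit weights below stand in for
    what V-FUSE learns from data.
    """
    score = support - occlusions - free_space_violations        # (S, H, W)
    prob = np.exp(score - score.max(axis=0, keepdims=True))
    prob /= prob.sum(axis=0, keepdims=True)                     # softmax over S
    fused_depth = (prob * hypotheses).sum(axis=0)               # soft argmax
    confidence = prob.max(axis=0)                               # peakedness of the distribution
    return fused_depth, confidence

# Toy usage: a 4x4 depth map with a constant search window and random
# consensus/violation counts standing in for cross-view evidence.
center = np.full((4, 4), 2.0)
radius = np.full((4, 4), 0.5)
hyps = build_hypotheses(center, radius)
S = hyps.shape[0]
rng = np.random.default_rng(0)
depth, conf = fuse(hyps,
                   rng.integers(0, 5, (S, 4, 4)).astype(float),
                   rng.integers(0, 2, (S, 4, 4)).astype(float),
                   rng.integers(0, 2, (S, 4, 4)).astype(float))
```

The soft-argmax regression over hypotheses is a common choice in learned MVS pipelines and is used here only to keep the sketch differentiable end to end, mirroring the trainability the abstract describes.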

Cite

Text

Burgdorfer and Mordohai. "V-FUSE: Volumetric Depth Map Fusion with Long-Range Constraints." International Conference on Computer Vision, 2023. doi:10.1109/ICCV51070.2023.00319

Markdown

[Burgdorfer and Mordohai. "V-FUSE: Volumetric Depth Map Fusion with Long-Range Constraints." International Conference on Computer Vision, 2023.](https://mlanthology.org/iccv/2023/burgdorfer2023iccv-vfuse/) doi:10.1109/ICCV51070.2023.00319

BibTeX

@inproceedings{burgdorfer2023iccv-vfuse,
  title     = {{V-FUSE: Volumetric Depth Map Fusion with Long-Range Constraints}},
  author    = {Burgdorfer, Nathaniel and Mordohai, Philippos},
  booktitle = {International Conference on Computer Vision},
  year      = {2023},
  pages     = {3449--3458},
  doi       = {10.1109/ICCV51070.2023.00319},
  url       = {https://mlanthology.org/iccv/2023/burgdorfer2023iccv-vfuse/}
}