Efficient Multi-View Stereo by Iterative Dynamic Cost Volume

Abstract

In this paper, we propose a novel iterative dynamic cost volume for multi-view stereo. Compared with other works, our cost volume is much lighter and can therefore be processed with a 2D-convolution-based GRU. Notably, the output of the GRU at each step can be used to generate a new cost volume. In this way, an iterative GRU-based optimizer is constructed. Furthermore, we present a cascade and hierarchical refinement architecture to utilize multi-scale information and speed up convergence. Specifically, a lightweight 3D CNN is used to generate the coarsest initial depth map, which is essential to launch the GRU and guarantee fast convergence. The depth map is then refined by multi-stage GRUs that operate on the pyramid feature maps. Extensive experiments on the DTU and Tanks & Temples benchmarks demonstrate that our method achieves state-of-the-art results in terms of accuracy, speed, and memory usage. Code will be released at https://github.com/bdwsq1996/Effi-MVS.
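
The control flow described in the abstract can be illustrated with a toy NumPy sketch. It is not the authors' implementation and every name in it is hypothetical. It makes three simplifying assumptions: the matching cost is the absolute distance to a known target depth (standing in for a learned feature-matching cost across views), a softmin-weighted offset stands in for the 2D-convolutional GRU update, and the initial depth map is passed in directly rather than predicted by a 3D CNN. What it does show is the loop itself: a small cost volume is rebuilt around the current depth estimate at every step, and the estimate is refined stage by stage on a coarse-to-fine pyramid.

```python
import numpy as np

def build_local_cost_volume(depth, target, interval, num_hyp=5):
    """Sample a few depth hypotheses around the current estimate.

    Returns the offsets (D,) and a small cost volume (D, H, W).
    The cost is |hypothesis - target|, a stand-in for a real matching cost.
    """
    offsets = (np.arange(num_hyp) - num_hyp // 2) * interval
    hypotheses = depth[None] + offsets[:, None, None]
    cost = np.abs(hypotheses - target[None])
    return offsets, cost

def update_step(offsets, cost, temperature):
    """Softmin-weighted depth offset (assumed stand-in for the GRU update)."""
    logits = -cost / temperature
    logits -= logits.max(axis=0, keepdims=True)  # numerical stability
    weights = np.exp(logits)
    weights /= weights.sum(axis=0, keepdims=True)
    return (weights * offsets[:, None, None]).sum(axis=0)

def upsample2x(depth):
    """Nearest-neighbour upsampling between pyramid stages."""
    return depth.repeat(2, axis=0).repeat(2, axis=1)

def refine(target_pyramid, init_depth, intervals, iters_per_stage=4):
    """Coarse-to-fine refinement with a cost volume rebuilt at every step."""
    depth = init_depth
    errors = []
    for stage, (target, interval) in enumerate(zip(target_pyramid, intervals)):
        if stage > 0:
            depth = upsample2x(depth)
        for _ in range(iters_per_stage):
            # The cost volume depends on the current depth, hence "dynamic".
            offsets, cost = build_local_cost_volume(depth, target, interval)
            depth = depth + update_step(offsets, cost, temperature=interval)
        errors.append(float(np.abs(depth - target).mean()))
    return depth, errors
```

Because only a handful of hypotheses are sampled around the current estimate, the volume stays small at every stage, and the sampling interval can shrink as the resolution grows.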

Cite

Text

Wang et al. "Efficient Multi-View Stereo by Iterative Dynamic Cost Volume." Conference on Computer Vision and Pattern Recognition, 2022. doi:10.1109/CVPR52688.2022.00846

Markdown

[Wang et al. "Efficient Multi-View Stereo by Iterative Dynamic Cost Volume." Conference on Computer Vision and Pattern Recognition, 2022.](https://mlanthology.org/cvpr/2022/wang2022cvpr-efficient/) doi:10.1109/CVPR52688.2022.00846

BibTeX

@inproceedings{wang2022cvpr-efficient,
  title     = {{Efficient Multi-View Stereo by Iterative Dynamic Cost Volume}},
  author    = {Wang, Shaoqian and Li, Bo and Dai, Yuchao},
  booktitle = {Conference on Computer Vision and Pattern Recognition},
  year      = {2022},
  pages     = {8655--8664},
  doi       = {10.1109/CVPR52688.2022.00846},
  url       = {https://mlanthology.org/cvpr/2022/wang2022cvpr-efficient/}
}