MVDD: Multi-View Depth Diffusion Models

Wang, Zhen; Xu, Qiangeng; Tan, Feitong; Chai, Menglei; Liu, Shichen; Pandey, Rohit; Fanello, Sean; Kadambi, Achuta; Zhang, Yinda

doi:10.1007/978-3-031-72624-8_14

ECCV 2024

Abstract

Denoising diffusion models have demonstrated outstanding results in 2D image generation, yet it remains a challenge to replicate their success in 3D shape generation. In this paper, we propose leveraging multi-view depth, which represents complex 3D shapes in a 2D data format that is easy to denoise. We pair this representation with a diffusion model, MVDD, that is capable of generating high-quality dense point clouds of 20K+ points with fine-grained details. To enforce 3D consistency in multi-view depth, we introduce an epipolar line segment attention that conditions the denoising step for a view on its neighboring views. Additionally, a depth fusion module is incorporated into the diffusion steps to further ensure the alignment of depth maps. When augmented with surface reconstruction, MVDD can also produce high-quality 3D meshes. Furthermore, MVDD stands out in other tasks such as depth completion, and can serve as a 3D prior, significantly boosting downstream tasks such as GAN inversion. State-of-the-art results from extensive experiments demonstrate MVDD’s excellent ability in 3D shape generation and depth completion, and its potential as a 3D prior for downstream tasks.
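
To illustrate why a set of depth maps can stand in for a 3D shape, the following is a minimal sketch of standard pinhole back-projection, which lifts each valid depth pixel to a 3D point and merges the views into one point cloud. It is not MVDD's code and does not reproduce its epipolar attention or depth fusion module. The pinhole camera model, the 3x4 camera-to-world matrix layout, the convention that non-positive depth means "no surface", and the names `unproject_depth` and `fuse_views` are all assumptions made for this example.

```python
def unproject_depth(depth, fx, fy, cx, cy, cam_to_world):
    """Back-project one depth map into world-space 3D points.

    depth        : list of rows; depth[v][u] is the z-depth at pixel (u, v).
                   Values <= 0 are treated as "no surface" (assumed convention).
    fx, fy       : focal lengths in pixels (assumed pinhole model).
    cx, cy       : principal point in pixels.
    cam_to_world : 3x4 row-major [R | t] camera-to-world matrix (assumed layout).
    """
    points = []
    for v, row in enumerate(depth):
        for u, z in enumerate(row):
            if z <= 0:
                continue  # background pixel, nothing to lift
            # Pixel (u, v) with depth z -> point in camera coordinates.
            p = ((u - cx) * z / fx, (v - cy) * z / fy, z)
            # Rigid transform into the shared world frame: R @ p + t.
            points.append(tuple(
                sum(r[i] * p[i] for i in range(3)) + r[3]
                for r in cam_to_world
            ))
    return points


def fuse_views(views):
    """Concatenate the back-projected points of every view into one cloud."""
    cloud = []
    for view in views:
        cloud.extend(unproject_depth(**view))
    return cloud


# Two hypothetical 3x3 views of the same surface point at world (0, 0, 2):
# a front camera at the origin and a back camera at z = 4 facing the origin.
center_hit = [[0.0, 0.0, 0.0],
              [0.0, 2.0, 0.0],
              [0.0, 0.0, 0.0]]
intrinsics = dict(fx=1.0, fy=1.0, cx=1.0, cy=1.0)
views = [
    dict(depth=center_hit, **intrinsics,
         cam_to_world=[[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0]]),
    dict(depth=center_hit, **intrinsics,
         cam_to_world=[[-1, 0, 0, 0], [0, 1, 0, 0], [0, 0, -1, 4]]),
]
cloud = fuse_views(views)
```

In this toy setup both views lift their single valid pixel to the same world-space location, which is the kind of cross-view agreement that 3D-consistent depth maps need to satisfy. With more views at higher resolution, the same procedure yields a dense cloud.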

Cite

Text

Wang et al. "MVDD: Multi-View Depth Diffusion Models." Proceedings of the European Conference on Computer Vision (ECCV), 2024. doi:10.1007/978-3-031-72624-8_14

Markdown

[Wang et al. "MVDD: Multi-View Depth Diffusion Models." Proceedings of the European Conference on Computer Vision (ECCV), 2024.](https://mlanthology.org/eccv/2024/wang2024eccv-mvdd/) doi:10.1007/978-3-031-72624-8_14

BibTeX

@inproceedings{wang2024eccv-mvdd,
  title     = {{MVDD: Multi-View Depth Diffusion Models}},
  author    = {Wang, Zhen and Xu, Qiangeng and Tan, Feitong and Chai, Menglei and Liu, Shichen and Pandey, Rohit and Fanello, Sean and Kadambi, Achuta and Zhang, Yinda},
  booktitle = {Proceedings of the European Conference on Computer Vision (ECCV)},
  year      = {2024},
  doi       = {10.1007/978-3-031-72624-8_14},
  url       = {https://mlanthology.org/eccv/2024/wang2024eccv-mvdd/}
}