MVDream: Multi-View Diffusion for 3D Generation

Abstract

We introduce MVDream, a diffusion model that generates consistent multi-view images from a given text prompt. By learning from both 2D and 3D data, a multi-view diffusion model can achieve the generalizability of 2D diffusion models and the consistency of 3D renderings. We demonstrate that such a multi-view diffusion model is implicitly a generalizable 3D prior agnostic to 3D representations. It can be applied to 3D generation via Score Distillation Sampling, significantly enhancing the consistency and stability of existing 2D-lifting methods. It can also learn new concepts from a few 2D examples, akin to DreamBooth, but for 3D generation.
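To make the Score Distillation Sampling (SDS) step mentioned above concrete, here is a minimal PyTorch-style sketch of the standard SDS gradient that 2D-lifting methods apply to a differentiable render. The names (unet, alphas_cumprod, render inputs) are hypothetical placeholders for illustration, not MVDream's actual API; MVDream's contribution is supplying a multi-view diffusion model as the denoiser in this loop.

import torch

def sds_grad(unet, alphas_cumprod, rendered, text_emb, t):
    """Return the SDS gradient w.r.t. the rendered image.

    rendered : (B, C, H, W) differentiable render of the 3D scene
    text_emb : conditioning embedding for the text prompt
    t        : (B,) integer diffusion timesteps
    """
    noise = torch.randn_like(rendered)
    a_t = alphas_cumprod[t].view(-1, 1, 1, 1)
    # Forward-diffuse the render to timestep t.
    noisy = a_t.sqrt() * rendered + (1 - a_t).sqrt() * noise
    with torch.no_grad():
        # Hypothetical denoiser call; SDS does not backprop through it.
        eps_pred = unet(noisy, t, text_emb)
    w = 1 - a_t  # a common SDS weighting choice
    # The residual (eps_pred - noise) acts as the gradient signal.
    return w * (eps_pred - noise)

In practice the gradient is injected directly, e.g. rendered.backward(gradient=sds_grad(unet, alphas_cumprod, rendered, text_emb, t)), so updates flow into the 3D representation's parameters through the renderer rather than through the diffusion U-Net.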

Cite

Text

Shi et al. "MVDream: Multi-View Diffusion for 3D Generation." International Conference on Learning Representations, 2024.

Markdown

[Shi et al. "MVDream: Multi-View Diffusion for 3D Generation." International Conference on Learning Representations, 2024.](https://mlanthology.org/iclr/2024/shi2024iclr-mvdream/)

BibTeX

@inproceedings{shi2024iclr-mvdream,
  title     = {{MVDream: Multi-View Diffusion for 3D Generation}},
  author    = {Shi, Yichun and Wang, Peng and Ye, Jianglong and Mai, Long and Li, Kejie and Yang, Xiao},
  booktitle = {International Conference on Learning Representations},
  year      = {2024},
  url       = {https://mlanthology.org/iclr/2024/shi2024iclr-mvdream/}
}