Let 2D Diffusion Model Know 3D-Consistency for Robust Text-to-3D Generation
Abstract
Text-to-3D generation has progressed rapidly in recent years with the advent of score distillation sampling (SDS), a method that uses pretrained text-to-2D diffusion models to optimize a neural radiance field (NeRF) in a zero-shot setting. However, the lack of 3D awareness in the 2D diffusion model often prevents previous methods from generating a plausible 3D scene. To address this issue, we propose 3DFuse, a novel framework that incorporates 3D awareness into the pretrained 2D diffusion model, enhancing the robustness and 3D consistency of score distillation-based methods. Specifically, we introduce a consistency injection module which constructs a 3D point cloud from the text prompt and uses its projected depth map at a given view as a condition for the diffusion model. The 2D diffusion model, through its generative capability, robustly infers dense structure from the sparse point cloud depth map and generates a geometrically consistent and coherent 3D scene. We also introduce a new technique called semantic coding that reduces the semantic ambiguity of the text prompt for improved results. Our method can be easily adapted to various text-to-3D baselines, and we experimentally demonstrate that it notably enhances the 3D consistency of generated scenes compared to previous baselines, achieving state-of-the-art performance in geometric robustness and fidelity.
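The projection step mentioned in the abstract, turning a point cloud into a sparse depth map at a given view, can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `project_depth`, the pinhole camera model, and the restriction to a yaw-only camera rotation are all assumptions made for illustration. The resulting map would then serve as the condition for the diffusion model, a step not shown here.

```python
import math


def project_depth(points, cam_pos, yaw, focal, size):
    """Render a sparse depth map from a point cloud (hypothetical helper).

    points:  iterable of (x, y, z) world coordinates
    cam_pos: (x, y, z) camera centre
    yaw:     camera rotation about the world y-axis, in radians
    focal:   focal length in pixels (assumed pinhole model)
    size:    the map is size x size; pixels no point lands on stay 0.0
    """
    depth = [[0.0] * size for _ in range(size)]
    c, s = math.cos(yaw), math.sin(yaw)
    half = size / 2.0
    for x, y, z in points:
        # Translate so the camera centre is the origin.
        dx, dy, dz = x - cam_pos[0], y - cam_pos[1], z - cam_pos[2]
        # Rotate from world axes into camera axes (inverse of the yaw rotation).
        xc = c * dx - s * dz
        yc = dy
        zc = s * dx + c * dz
        if zc <= 0:
            continue  # point is behind the camera
        # Perspective divide, then shift the principal point to the image centre.
        u = math.floor(focal * xc / zc + half)
        v = math.floor(focal * yc / zc + half)
        if 0 <= u < size and 0 <= v < size:
            # Keep the nearest point when several land on the same pixel.
            if depth[v][u] == 0.0 or zc < depth[v][u]:
                depth[v][u] = zc
    return depth


# Three points in front of the camera and one behind it.
cloud = [(0, 0, 2), (0, 0, 3), (1, 0, 2), (0, 0, -1)]
depth_map = project_depth(cloud, cam_pos=(0, 0, 0), yaw=0.0, focal=4, size=8)
```

Most pixels stay empty because the point cloud is sparse; in the framework described above, inferring dense structure from such a map is left to the diffusion model.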
Cite
Text
Seo et al. "Let 2D Diffusion Model Know 3D-Consistency for Robust Text-to-3D Generation." International Conference on Learning Representations, 2024.
Markdown
[Seo et al. "Let 2D Diffusion Model Know 3D-Consistency for Robust Text-to-3D Generation." International Conference on Learning Representations, 2024.](https://mlanthology.org/iclr/2024/seo2024iclr-let/)
BibTeX
@inproceedings{seo2024iclr-let,
title = {{Let 2D Diffusion Model Know 3D-Consistency for Robust Text-to-3D Generation}},
author = {Seo, Junyoung and Jang, Wooseok and Kwak, Min-Seop and Kim, Hyeonsu and Ko, Jaehoon and Kim, Junho and Kim, Jin-Hwa and Lee, Jiyoung and Kim, Seungryong},
booktitle = {International Conference on Learning Representations},
year = {2024},
url = {https://mlanthology.org/iclr/2024/seo2024iclr-let/}
}