ATT3D: Amortized Text-to-3D Object Synthesis
Abstract
Text-to-3D modelling has seen exciting progress by combining generative text-to-image models with image-to-3D methods like Neural Radiance Fields. DreamFusion recently achieved high-quality results but requires a lengthy, per-prompt optimization to create 3D objects. To address this, we amortize optimization over text prompts by training on many prompts simultaneously with a unified model instead of separately. With this, we share computation across a prompt set, training in less time than per-prompt optimization. Our framework, Amortized Text-to-3D (ATT3D), enables knowledge sharing between prompts to generalize to unseen setups and smooth interpolations between text for novel assets and simple animations.
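The abstract's core idea, amortizing text-to-3D optimization by training one text-conditioned model over many prompts at once, can be illustrated with a minimal sketch. The code below is not the authors' implementation: the module sizes, the `TextConditionedField` architecture, and the `toy_sds_loss` stand-in (replacing the diffusion-guided Score Distillation Sampling loss that ATT3D inherits from DreamFusion) are all assumptions for illustration only.

```python
# Minimal sketch (NOT the ATT3D code) of amortized text-to-3D training:
# one unified model is optimized over a whole prompt set, instead of
# running a separate per-prompt optimization as in DreamFusion.
import torch
import torch.nn as nn

class TextConditionedField(nn.Module):
    """Toy stand-in for a text-conditioned 3D field: a prompt embedding
    modulates a small MLP mapping 3D points to (density, rgb)."""
    def __init__(self, text_dim=32, hidden=64):
        super().__init__()
        self.modulator = nn.Linear(text_dim, hidden)   # prompt -> modulation
        self.field = nn.Sequential(
            nn.Linear(3 + hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 4),                      # 1 density + 3 rgb
        )

    def forward(self, points, text_emb):
        mod = self.modulator(text_emb).expand(points.shape[0], -1)
        return self.field(torch.cat([points, mod], dim=-1))

def toy_sds_loss(pred, text_emb):
    # Placeholder for the real SDS loss (which scores rendered images with
    # a frozen text-to-image diffusion model); a dummy objective here so
    # the sketch runs end to end.
    return (pred - text_emb.mean()).pow(2).mean()

model = TextConditionedField()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
prompt_embs = torch.randn(100, 32)  # stand-in for frozen text-encoder outputs

for step in range(1000):
    idx = torch.randint(0, len(prompt_embs), (8,))   # sample a prompt batch
    loss = 0.0
    for emb in prompt_embs[idx]:
        points = torch.rand(256, 3)                  # stand-in for ray samples
        loss = loss + toy_sds_loss(model(points, emb), emb)
    opt.zero_grad()
    (loss / 8).backward()                            # shared gradient step
    opt.step()
```

Because a single network carries the whole prompt set, an unseen prompt embedding (or an interpolation between two embeddings) yields a 3D field in one forward pass, which is what enables the generalization, interpolation, and simple-animation results the abstract describes.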
Cite
Text
Lorraine et al. "ATT3D: Amortized Text-to-3D Object Synthesis." International Conference on Computer Vision, 2023. doi:10.1109/ICCV51070.2023.01645

Markdown

[Lorraine et al. "ATT3D: Amortized Text-to-3D Object Synthesis." International Conference on Computer Vision, 2023.](https://mlanthology.org/iccv/2023/lorraine2023iccv-att3d/) doi:10.1109/ICCV51070.2023.01645

BibTeX
@inproceedings{lorraine2023iccv-att3d,
title = {{ATT3D: Amortized Text-to-3D Object Synthesis}},
author = {Lorraine, Jonathan and Xie, Kevin and Zeng, Xiaohui and Lin, Chen-Hsuan and Takikawa, Towaki and Sharp, Nicholas and Lin, Tsung-Yi and Liu, Ming-Yu and Fidler, Sanja and Lucas, James},
booktitle = {International Conference on Computer Vision},
year = {2023},
pages = {17946--17956},
doi = {10.1109/ICCV51070.2023.01645},
url = {https://mlanthology.org/iccv/2023/lorraine2023iccv-att3d/}
}