SynCity: Training-Free Generation of 3D Worlds
Abstract
We propose SynCity, a method for generating explorable 3D worlds from textual descriptions. Our approach leverages pre-trained textual, image, and 3D generators without requiring fine-tuning or inference-time optimization. While most 3D generators are object-centric and unable to create large-scale worlds, we demonstrate how 2D and 3D generators can be combined to produce ever-expanding scenes. The world is generated tile by tile, with each new tile created within its context and seamlessly integrated into the scene. SynCity enables fine-grained control over the appearance and layout of the generated worlds, which are both detailed and diverse.
Cite
Text
Engstler et al. "SynCity: Training-Free Generation of 3D Worlds." International Conference on Computer Vision, 2025.
Markdown
[Engstler et al. "SynCity: Training-Free Generation of 3D Worlds." International Conference on Computer Vision, 2025.](https://mlanthology.org/iccv/2025/engstler2025iccv-syncity/)
BibTeX
@inproceedings{engstler2025iccv-syncity,
  title = {{SynCity: Training-Free Generation of 3D Worlds}},
  author = {Engstler, Paul and Shtedritski, Aleksandar and Laina, Iro and Rupprecht, Christian and Vedaldi, Andrea},
  booktitle = {International Conference on Computer Vision},
  year = {2025},
  pages = {27585-27595},
  url = {https://mlanthology.org/iccv/2025/engstler2025iccv-syncity/}
}