ZeroNVS: Zero-Shot 360-Degree View Synthesis from a Single Image
Abstract
We introduce ZeroNVS, a 3D-aware diffusion model for single-image novel view synthesis of in-the-wild scenes. While existing methods are designed for single objects with masked backgrounds, we propose new techniques to address the challenges introduced by in-the-wild multi-object scenes with complex backgrounds. Specifically, we train a generative prior on a mixture of data sources that capture object-centric, indoor, and outdoor scenes. To address issues arising from the data mixture, such as depth-scale ambiguity, we propose a novel camera conditioning parameterization and normalization scheme. Further, we observe that Score Distillation Sampling (SDS) tends to truncate the distribution of complex backgrounds during distillation of 360-degree scenes, and we propose "SDS anchoring" to improve the diversity of synthesized novel views. Our model sets a new state-of-the-art result in LPIPS on the DTU dataset in the zero-shot setting, even outperforming methods specifically trained on DTU. We further adapt the challenging Mip-NeRF 360 dataset as a new benchmark for single-image novel view synthesis and demonstrate strong performance in this setting. Code and models will be publicly available.
Cite
Text
Sargent et al. "ZeroNVS: Zero-Shot 360-Degree View Synthesis from a Single Image." Conference on Computer Vision and Pattern Recognition, 2024. doi:10.1109/CVPR52733.2024.00900
Markdown
[Sargent et al. "ZeroNVS: Zero-Shot 360-Degree View Synthesis from a Single Image." Conference on Computer Vision and Pattern Recognition, 2024.](https://mlanthology.org/cvpr/2024/sargent2024cvpr-zeronvs/) doi:10.1109/CVPR52733.2024.00900
BibTeX
@inproceedings{sargent2024cvpr-zeronvs,
title = {{ZeroNVS: Zero-Shot 360-Degree View Synthesis from a Single Image}},
author = {Sargent, Kyle and Li, Zizhang and Shah, Tanmay and Herrmann, Charles and Yu, Hong-Xing and Zhang, Yunzhi and Chan, Eric Ryan and Lagun, Dmitry and Fei-Fei, Li and Sun, Deqing and Wu, Jiajun},
booktitle = {Conference on Computer Vision and Pattern Recognition},
year = {2024},
pages = {9420--9429},
doi = {10.1109/CVPR52733.2024.00900},
url = {https://mlanthology.org/cvpr/2024/sargent2024cvpr-zeronvs/}
}