MVSplat360: Feed-Forward 360 Scene Synthesis from Sparse Views

Abstract

We introduce MVSplat360, a feed-forward approach for 360° novel view synthesis (NVS) of diverse real-world scenes using only sparse observations. This setting is inherently ill-posed because the input views overlap minimally and provide insufficient visual information, making it challenging for conventional methods to achieve high-quality results. MVSplat360 addresses this by combining geometry-aware 3D reconstruction with temporally consistent video generation. Specifically, it refactors a feed-forward 3D Gaussian Splatting (3DGS) model to render features directly into the latent space of a pre-trained Stable Video Diffusion (SVD) model, where these features act as pose and visual cues that guide the denoising process and produce photorealistic, 3D-consistent views. Our model is end-to-end trainable and supports rendering arbitrary views from as few as 5 sparse input views. To evaluate MVSplat360's performance, we introduce a new benchmark using the challenging DL3DV-10K dataset, where MVSplat360 achieves superior visual quality compared to state-of-the-art methods on wide-sweeping and even 360° NVS tasks. Experiments on the existing RealEstate10K benchmark also confirm the effectiveness of our model. We strongly encourage readers to view the video results at donydchen.github.io/mvsplat360.

Cite

Text

Chen et al. "MVSplat360: Feed-Forward 360 Scene Synthesis from Sparse Views." Neural Information Processing Systems, 2024. doi:10.52202/079017-3399

Markdown

[Chen et al. "MVSplat360: Feed-Forward 360 Scene Synthesis from Sparse Views." Neural Information Processing Systems, 2024.](https://mlanthology.org/neurips/2024/chen2024neurips-mvsplat360/) doi:10.52202/079017-3399

BibTeX

@inproceedings{chen2024neurips-mvsplat360,
  title     = {{MVSplat360: Feed-Forward 360 Scene Synthesis from Sparse Views}},
  author    = {Chen, Yuedong and Zheng, Chuanxia and Xu, Haofei and Zhuang, Bohan and Vedaldi, Andrea and Cham, Tat-Jen and Cai, Jianfei},
  booktitle = {Neural Information Processing Systems},
  year      = {2024},
  doi       = {10.52202/079017-3399},
  url       = {https://mlanthology.org/neurips/2024/chen2024neurips-mvsplat360/}
}