SplatFlow: Multi-View Rectified Flow Model for 3D Gaussian Splatting Synthesis
Abstract
Text-based generation and editing of 3D scenes hold significant potential for streamlining content creation through intuitive user interactions. While recent advances leverage 3D Gaussian Splatting (3DGS) for high-fidelity and real-time rendering, existing methods are often specialized and task-focused, lacking a unified framework for both generation and editing. In this paper, we introduce SplatFlow, a comprehensive framework that addresses this gap by enabling direct 3DGS generation and editing. SplatFlow comprises two main components: a multi-view rectified flow (RF) model and a Gaussian Splatting Decoder (GSDecoder). The multi-view RF model operates in latent space, generating multi-view images, depths, and camera poses simultaneously, conditioned on text prompts--thus addressing challenges like diverse scene scales and complex camera trajectories in real-world settings. Then, the GSDecoder efficiently translates these latent outputs into 3DGS representations through a feed-forward 3DGS method. Leveraging training-free inversion and inpainting techniques, SplatFlow enables seamless 3DGS editing and supports a broad range of 3D tasks--including object editing, novel view synthesis, and camera pose estimation--within a unified framework without requiring additional complex pipelines. We validate SplatFlow's capabilities on the MVImgNet and DL3DV-7K datasets, demonstrating its versatility and effectiveness in various 3D generation, editing, and inpainting-based tasks.
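As a rough illustration of the sampling idea behind a rectified flow model, the sketch below integrates a velocity field from noise (t = 0) to a sample (t = 1) with fixed-step Euler updates. It is not the authors' implementation: the names `euler_sample` and `make_toy_velocity` are hypothetical, the time convention (noise at t = 0, data at t = 1) is an assumption, and the learned, text-conditioned multi-view network is replaced by a closed-form toy velocity field whose data distribution is a single target point.

```python
import random


def euler_sample(velocity, x0, num_steps=10):
    """Integrate dx/dt = velocity(x, t) from t=0 to t=1 with fixed-step Euler."""
    x = list(x0)
    dt = 1.0 / num_steps
    for k in range(num_steps):
        t = k * dt  # t stays strictly below 1, so the toy field never divides by zero
        v = velocity(x, t)
        x = [xi + dt * vi for xi, vi in zip(x, v)]
    return x


def make_toy_velocity(target):
    """Hypothetical stand-in for a learned network.

    When the data distribution is the single point `target`, the exact
    rectified-flow velocity is (target - x) / (1 - t), which is constant
    along each straight noise-to-data path.
    """
    def velocity(x, t):
        return [(ti - xi) / (1.0 - t) for ti, xi in zip(target, x)]
    return velocity


rng = random.Random(0)
target = [1.0, -2.0, 0.5]                       # toy "data" point (assumption)
noise = [rng.gauss(0.0, 1.0) for _ in target]   # starting Gaussian noise
sample = euler_sample(make_toy_velocity(target), noise, num_steps=8)
```

Because the toy path is a straight line, the Euler integration lands on the target up to floating-point error for any step count; a learned velocity field would only approximate this behavior.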
Cite
Text
Go et al. "SplatFlow: Multi-View Rectified Flow Model for 3D Gaussian Splatting Synthesis." Conference on Computer Vision and Pattern Recognition, 2025. doi:10.1109/CVPR52734.2025.02005
Markdown
[Go et al. "SplatFlow: Multi-View Rectified Flow Model for 3D Gaussian Splatting Synthesis." Conference on Computer Vision and Pattern Recognition, 2025.](https://mlanthology.org/cvpr/2025/go2025cvpr-splatflow/) doi:10.1109/CVPR52734.2025.02005
BibTeX
@inproceedings{go2025cvpr-splatflow,
title = {{SplatFlow: Multi-View Rectified Flow Model for 3D Gaussian Splatting Synthesis}},
author = {Go, Hyojun and Park, Byeongjun and Jang, Jiho and Kim, Jin-Young and Kwon, Soonwoo and Kim, Changick},
booktitle = {Conference on Computer Vision and Pattern Recognition},
year = {2025},
pages = {21524-21536},
doi = {10.1109/CVPR52734.2025.02005},
url = {https://mlanthology.org/cvpr/2025/go2025cvpr-splatflow/}
}