ProDyG: Progressive Dynamic Scene Reconstruction via Gaussian Splatting from Monocular Videos

Abstract

Achieving truly practical dynamic 3D reconstruction requires online operation, global pose and map consistency, detailed appearance modeling, and the flexibility to handle both RGB and RGB-D inputs. However, existing SLAM methods typically either discard the dynamic parts or require RGB-D input, offline methods do not scale to long video sequences, and current transformer-based feedforward methods lack global consistency and appearance detail. To address these limitations, we achieve online dynamic scene reconstruction by disentangling the static and dynamic parts within a SLAM system. Camera poses are tracked robustly with a novel motion masking strategy, and the dynamic parts are reconstructed by progressively adapting a Motion Scaffolds graph. Our method yields novel view renderings competitive with offline methods and achieves tracking on par with state-of-the-art dynamic SLAM methods.
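
As a rough illustration of the static/dynamic disentanglement idea, pose tracking can exclude pixels flagged as dynamic by the motion mask from the photometric objective. The sketch below is a minimal, hypothetical example (the function name and exact loss form are assumptions, not the paper's implementation):

import torch

def masked_tracking_loss(rendered_rgb, observed_rgb, motion_mask):
    # rendered_rgb, observed_rgb: (H, W, 3) float tensors
    # motion_mask: (H, W) tensor, 1 where a pixel is flagged as dynamic, 0 where static
    static = 1.0 - motion_mask.float()
    # Exclude dynamic pixels so moving objects do not bias the camera pose estimate
    residual = (rendered_rgb - observed_rgb).abs() * static.unsqueeze(-1)
    # Normalize by the number of static pixels (times 3 color channels) so the
    # loss scale stays comparable across frames with different amounts of motion
    return residual.sum() / (3.0 * static.sum().clamp(min=1.0))

# Example usage with a synthetic frame and a hypothetical dynamic region
H, W = 480, 640
rendered = torch.rand(H, W, 3)
observed = torch.rand(H, W, 3)
mask = torch.zeros(H, W)
mask[100:200, 100:200] = 1.0  # pixels covered by a moving object
loss = masked_tracking_loss(rendered, observed, mask)

The masked-out dynamic regions would then be handled by the separate dynamic reconstruction branch rather than the pose tracker.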

Cite

Text

Chen et al. "ProDyG: Progressive Dynamic Scene Reconstruction via Gaussian Splatting from Monocular Videos." Advances in Neural Information Processing Systems, 2025.

Markdown

[Chen et al. "ProDyG: Progressive Dynamic Scene Reconstruction via Gaussian Splatting from Monocular Videos." Advances in Neural Information Processing Systems, 2025.](https://mlanthology.org/neurips/2025/chen2025neurips-prodyg/)

BibTeX

@inproceedings{chen2025neurips-prodyg,
  title     = {{ProDyG: Progressive Dynamic Scene Reconstruction via Gaussian Splatting from Monocular Videos}},
  author    = {Chen, Shi and Sandström, Erik and Lombardi, Sandro and Li, Siyuan and Oswald, Martin R.},
  booktitle = {Advances in Neural Information Processing Systems},
  year      = {2025},
  url       = {https://mlanthology.org/neurips/2025/chen2025neurips-prodyg/}
}