Total-Recon: Deformable Scene Reconstruction for Embodied View Synthesis
Abstract
We explore the task of embodied view synthesis from monocular videos of deformable scenes. Given a minute-long RGBD video of people interacting with their pets, we render the scene from novel camera trajectories derived from the in-scene motion of actors: (1) egocentric cameras that simulate the point of view of a target actor and (2) 3rd-person cameras that follow the actor. Building such a system requires reconstructing the root-body and articulated motion of every actor, as well as a scene representation that supports free-viewpoint synthesis. Longer videos are more likely to capture the scene from diverse viewpoints (which helps reconstruction) but are also more likely to contain larger motions (which complicates reconstruction). To address these challenges, we present Total-Recon, the first method to photorealistically reconstruct deformable scenes from long monocular RGBD videos. Crucially, to scale to long videos, our method hierarchically decomposes the scene into the background and objects, whose motion is decomposed into carefully initialized root-body motion and local articulations. To quantify such "in-the-wild" reconstruction and view synthesis, we collect ground-truth data from a specialized stereo RGBD capture rig for 11 challenging videos, on which our method significantly outperforms prior methods.
Cite
Text
Song et al. "Total-Recon: Deformable Scene Reconstruction for Embodied View Synthesis." International Conference on Computer Vision, 2023. doi:10.1109/ICCV51070.2023.01620
Markdown
[Song et al. "Total-Recon: Deformable Scene Reconstruction for Embodied View Synthesis." International Conference on Computer Vision, 2023.](https://mlanthology.org/iccv/2023/song2023iccv-totalrecon/) doi:10.1109/ICCV51070.2023.01620
BibTeX
@inproceedings{song2023iccv-totalrecon,
title = {{Total-Recon: Deformable Scene Reconstruction for Embodied View Synthesis}},
author = {Song, Chonghyuk and Yang, Gengshan and Deng, Kangle and Zhu, Jun-Yan and Ramanan, Deva},
booktitle = {International Conference on Computer Vision},
year = {2023},
pages = {17671-17682},
doi = {10.1109/ICCV51070.2023.01620},
url = {https://mlanthology.org/iccv/2023/song2023iccv-totalrecon/}
}