Class-Agnostic Reconstruction of Dynamic Objects from Videos
Abstract
We introduce REDO, a class-agnostic framework to REconstruct Dynamic Objects from RGBD or calibrated videos. Compared to prior work, our problem setting is more realistic yet more challenging for three reasons: 1) due to occlusion or camera settings, an object of interest may never be entirely visible, yet we aim to reconstruct its complete shape; 2) we aim to handle different object dynamics, including rigid motion, non-rigid motion, and articulation; 3) we aim to reconstruct different categories of objects with one unified framework. To address these challenges, we develop two novel modules. First, we introduce a canonical 4D implicit function which is pixel-aligned with aggregated temporal visual cues. Second, we develop a 4D transformation module which captures object dynamics to support temporal propagation and aggregation. We study the efficacy of REDO in extensive experiments on the synthetic RGBD video datasets SAIL-VOS 3D and DeformingThings4D++, and on the real-world video dataset 3DPW. We find that REDO outperforms state-of-the-art dynamic reconstruction methods by a clear margin. In ablation studies we validate each developed component.
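To make the abstract's first module concrete, below is a minimal, generic sketch of a pixel-aligned, temporally aggregated implicit-function query: a canonical 3D point is deformed into each frame, projected to pixel coordinates, used to bilinearly sample per-frame feature maps, and the time-averaged feature is decoded into occupancy. This is an illustrative numpy sketch, not REDO's actual implementation; all function names (`bilinear_sample`, `query_occupancy`) and the choice of mean pooling and a callable `mlp` decoder are assumptions for exposition.

```python
import numpy as np

def bilinear_sample(feat, u, v):
    """Bilinearly sample a (H, W, C) feature map at continuous pixel (u, v)."""
    H, W, _ = feat.shape
    u = np.clip(u, 0.0, W - 1 - 1e-6)
    v = np.clip(v, 0.0, H - 1 - 1e-6)
    u0, v0 = int(u), int(v)
    u1, v1 = u0 + 1, v0 + 1
    du, dv = u - u0, v - v0
    return ((1 - du) * (1 - dv) * feat[v0, u0]
            + du * (1 - dv) * feat[v0, u1]
            + (1 - du) * dv * feat[v1, u0]
            + du * dv * feat[v1, u1])

def query_occupancy(point, frames, transforms, project, mlp):
    """Occupancy of a canonical-space query point from T frames.

    point:      (3,) query in the canonical (time-independent) space
    frames:     list of (H, W, C) per-frame image feature maps
    transforms: list of callables mapping the canonical point into frame t
                (stand-in for a learned 4D transformation module)
    project:    callable mapping a 3D frame-space point to pixel (u, v)
    mlp:        callable decoding [point, aggregated feature] -> occupancy
    """
    feats = []
    for feat_map, T in zip(frames, transforms):
        p_t = T(point)                # deform canonical point into frame t
        u, v = project(p_t)           # camera projection to pixel coords
        feats.append(bilinear_sample(feat_map, u, v))
    agg = np.mean(feats, axis=0)      # temporal aggregation (mean pooling here)
    return mlp(np.concatenate([point, agg]))
```

In this sketch the per-frame transforms play the role the abstract assigns to the 4D transformation module: they let visual cues from every frame be propagated into, and aggregated at, a single canonical query point.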
Cite

Text
Ren et al. "Class-Agnostic Reconstruction of Dynamic Objects from Videos." Neural Information Processing Systems, 2021.

Markdown
[Ren et al. "Class-Agnostic Reconstruction of Dynamic Objects from Videos." Neural Information Processing Systems, 2021.](https://mlanthology.org/neurips/2021/ren2021neurips-classagnostic/)

BibTeX
@inproceedings{ren2021neurips-classagnostic,
title = {{Class-Agnostic Reconstruction of Dynamic Objects from Videos}},
author = {Ren, Zhongzheng and Zhao, Xiaoming and Schwing, Alex},
booktitle = {Neural Information Processing Systems},
year = {2021},
url = {https://mlanthology.org/neurips/2021/ren2021neurips-classagnostic/}
}