RoHM: Robust Human Motion Reconstruction via Diffusion

Siwei Zhang, Bharat Lal Bhatnagar, Yuanlu Xu, Alexander Winkler, Petr Kadlecek, Siyu Tang, Federica Bogo

CVPR 2024 pp. 14606-14617

doi:10.1109/CVPR52733.2024.01384

Abstract

We propose RoHM, an approach for robust 3D human motion reconstruction from monocular RGB(-D) videos in the presence of noise and occlusions. Most previous approaches either train neural networks to directly regress motion in 3D or learn data-driven motion priors and combine them with optimization at test time. RoHM is a novel diffusion-based motion model that, conditioned on noisy and occluded input data, reconstructs complete, plausible motions in consistent global coordinates. Given the complexity of the problem -- requiring one to address different tasks (denoising and infilling) in different solution spaces (local and global motion) -- we decompose it into two sub-tasks and learn two models, one for global trajectory and one for local motion. To capture the correlations between the two, we then introduce a novel conditioning module, combining it with an iterative inference scheme. We apply RoHM to a variety of tasks -- from motion reconstruction and denoising to spatial and temporal infilling. Extensive experiments on three popular datasets show that our method outperforms state-of-the-art approaches qualitatively and quantitatively, while being faster at test time. The code is available at https://sanweiliti.github.io/ROHM/ROHM.html.
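The two-model decomposition with iterative inference can be pictured as a simple alternating loop. The following is a minimal sketch of that control flow only, not the authors' implementation: the names `iterative_inference`, `moving_average`, and `num_rounds` are hypothetical, and a 3-frame moving average stands in for the learned diffusion models, which this toy does not attempt to reproduce.

```python
from typing import Callable, List, Sequence, Tuple

Signal = List[float]
# Hypothetical interface: (current estimate, conditioning signal) -> refined estimate.
Refiner = Callable[[Sequence[float], Sequence[float]], Signal]


def moving_average(values: Sequence[float], cond: Sequence[float]) -> Signal:
    """Toy stand-in for a learned denoiser: a 3-frame moving average.

    `cond` is accepted only to mirror a conditioning interface; a real
    model would use it, whereas this placeholder ignores it.
    """
    n = len(values)
    out = []
    for t in range(n):
        window = values[max(0, t - 1): min(n, t + 2)]
        out.append(sum(window) / len(window))
    return out


def iterative_inference(
    noisy_traj: Sequence[float],
    noisy_local: Sequence[float],
    traj_model: Refiner,
    local_model: Refiner,
    num_rounds: int = 2,  # assumed value, chosen only for illustration
) -> Tuple[Signal, Signal]:
    """Alternate between the two models, each conditioned on the other's output."""
    traj, local = list(noisy_traj), list(noisy_local)
    for _ in range(num_rounds):
        traj = traj_model(traj, local)    # global trajectory given local motion
        local = local_model(local, traj)  # local motion given refined trajectory
    return traj, local


if __name__ == "__main__":
    traj, local = iterative_inference(
        [0.0, 0.0, 3.0, 0.0, 0.0],  # trajectory with a noise spike
        [1.0, 1.0, 1.0, 1.0, 1.0],  # already-clean local signal
        moving_average,
        moving_average,
    )
    print(traj, local)
```

The point of the sketch is the alternation: each pass feeds the latest estimate from one solution space into the model for the other, which is one plausible way to read "capturing the correlations between the two".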

Cite

Text

Zhang et al. "RoHM: Robust Human Motion Reconstruction via Diffusion." Conference on Computer Vision and Pattern Recognition, 2024. doi:10.1109/CVPR52733.2024.01384

Markdown

[Zhang et al. "RoHM: Robust Human Motion Reconstruction via Diffusion." Conference on Computer Vision and Pattern Recognition, 2024.](https://mlanthology.org/cvpr/2024/zhang2024cvpr-rohm/) doi:10.1109/CVPR52733.2024.01384

BibTeX

@inproceedings{zhang2024cvpr-rohm,
  title     = {{RoHM: Robust Human Motion Reconstruction via Diffusion}},
  author    = {Zhang, Siwei and Bhatnagar, Bharat Lal and Xu, Yuanlu and Winkler, Alexander and Kadlecek, Petr and Tang, Siyu and Bogo, Federica},
  booktitle = {Conference on Computer Vision and Pattern Recognition},
  year      = {2024},
  pages     = {14606--14617},
  doi       = {10.1109/CVPR52733.2024.01384},
  url       = {https://mlanthology.org/cvpr/2024/zhang2024cvpr-rohm/}
}