Synthesizing Long-Term 3D Human Motion and Interaction in 3D Scenes

Abstract

Synthesizing 3D human motion plays an important role in many graphics applications as well as in understanding human activity. While many efforts have been made on generating realistic and natural human motion, most approaches neglect the importance of modeling human-scene interactions and affordances. On the other hand, affordance reasoning (e.g., standing on the floor or sitting on a chair) has mainly been studied with static human poses and gestures, and it has rarely been addressed with human motion. In this paper, we propose to bridge human motion synthesis and scene affordance reasoning. We present a hierarchical generative framework that synthesizes long-term 3D human motion conditioned on the 3D scene structure. We further enforce multiple geometry constraints between the human mesh and the scene point cloud via optimization to improve the realism of the synthesized motion. Our experiments show significant improvements over previous approaches in generating natural and physically plausible human motion in a scene.
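To make the optimization step concrete, below is a minimal sketch of the kind of geometry constraint the abstract describes: a penetration penalty that keeps body vertices from sinking below the scene surface, and a contact term that pulls the lowest vertices onto it. This is an illustrative simplification, not the paper's actual formulation: it assumes a flat floor plane at z = 0 in place of a full scene point cloud, optimizes only a global vertical offset, and all function names are hypothetical.

```python
import numpy as np

def penetration_loss(verts, floor_z=0.0):
    # Hypothetical stand-in for a mesh-vs-scene penetration term:
    # penalize vertices that fall below the floor plane.
    below = np.minimum(verts[:, 2] - floor_z, 0.0)
    return np.sum(below ** 2)

def contact_loss(verts, floor_z=0.0):
    # Encourage the lowest vertex (e.g. a foot) to touch the floor.
    return (np.min(verts[:, 2]) - floor_z) ** 2

def optimize_translation(verts, steps=200, lr=0.1):
    # Gradient descent on a single vertical offset t applied to the
    # whole body, using analytic gradients of the two quadratic terms.
    t = 0.0
    for _ in range(steps):
        z = verts[:, 2] + t
        below = np.minimum(z, 0.0)
        grad = 2.0 * np.sum(below) + 2.0 * np.min(z)
        t -= lr * grad
    return t

# Toy "human": random vertices floating 0.5 m above the floor.
rng = np.random.default_rng(0)
verts = rng.uniform(0.2, 1.8, size=(100, 3))
verts[:, 2] += 0.5

t = optimize_translation(verts)
adjusted = verts.copy()
adjusted[:, 2] += t  # lowest vertex now rests on the floor plane
```

In the paper's setting the floor plane would be replaced by distances between the posed human mesh and the scene point cloud, and the optimization variables would include the full pose trajectory rather than a single offset, but the structure of the objective (penetration penalty plus contact encouragement) is the same idea.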

Cite

Text

Wang et al. "Synthesizing Long-Term 3D Human Motion and Interaction in 3D Scenes." Conference on Computer Vision and Pattern Recognition, 2021. doi:10.1109/CVPR46437.2021.00928

Markdown

[Wang et al. "Synthesizing Long-Term 3D Human Motion and Interaction in 3D Scenes." Conference on Computer Vision and Pattern Recognition, 2021.](https://mlanthology.org/cvpr/2021/wang2021cvpr-synthesizing/) doi:10.1109/CVPR46437.2021.00928

BibTeX

@inproceedings{wang2021cvpr-synthesizing,
  title     = {{Synthesizing Long-Term 3D Human Motion and Interaction in 3D Scenes}},
  author    = {Wang, Jiashun and Xu, Huazhe and Xu, Jingwei and Liu, Sifei and Wang, Xiaolong},
  booktitle = {Conference on Computer Vision and Pattern Recognition},
  year      = {2021},
  pages     = {9401--9411},
  doi       = {10.1109/CVPR46437.2021.00928},
  url       = {https://mlanthology.org/cvpr/2021/wang2021cvpr-synthesizing/}
}