SemGeoMo: Dynamic Contextual Human Motion Generation with Semantic and Geometric Guidance

Abstract

Generating reasonable, high-quality human interactive motions in a given dynamic environment is crucial for understanding, modeling, transferring, and applying human behaviors to both virtual and physical robots. In this paper, we introduce an effective method, SemGeoMo, for dynamic contextual human motion generation, which fully leverages text-affordance-joint multi-level semantic and geometric guidance in the generation process, improving the semantic rationality and geometric correctness of the generated motions. Our method achieves state-of-the-art performance on three datasets and demonstrates superior generalization capability for diverse interaction scenarios.
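To loosely illustrate the idea of multi-level guidance mentioned in the abstract, below is a minimal sketch of a denoiser that fuses text-level, affordance-level, and joint-level features as conditioning signals. All module names, feature dimensions, and the fusion-by-addition scheme are illustrative assumptions and do not reflect SemGeoMo's actual architecture.

import torch
import torch.nn as nn

class GuidedMotionDenoiser(nn.Module):
    """Toy denoiser conditioned on text, affordance, and joint-level cues (illustrative only)."""
    def __init__(self, joint_dim=22 * 3, text_dim=512, afford_dim=128, hidden=256):
        super().__init__()
        # Project each guidance level into a shared hidden space.
        self.text_proj = nn.Linear(text_dim, hidden)
        self.afford_proj = nn.Linear(afford_dim, hidden)
        self.joint_proj = nn.Linear(joint_dim, hidden)
        self.backbone = nn.GRU(hidden, hidden, batch_first=True)
        self.head = nn.Linear(hidden, joint_dim)

    def forward(self, noisy_motion, text_feat, afford_feat):
        # noisy_motion: (B, T, joint_dim); text_feat: (B, text_dim);
        # afford_feat: (B, T, afford_dim), e.g. per-frame geometric affordance cues.
        cond = (self.joint_proj(noisy_motion)
                + self.afford_proj(afford_feat)
                + self.text_proj(text_feat).unsqueeze(1))  # broadcast text over time
        h, _ = self.backbone(cond)
        return self.head(h)  # predicted denoised joint positions

model = GuidedMotionDenoiser()
x = torch.randn(2, 60, 66)     # two noisy 60-frame motions, 22 joints x 3 coords
txt = torch.randn(2, 512)      # e.g. a CLIP-style sentence embedding
aff = torch.randn(2, 60, 128)  # per-frame affordance features from the dynamic scene
out = model(x, txt, aff)
print(out.shape)               # torch.Size([2, 60, 66])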

Cite

Text

Cong et al. "SemGeoMo: Dynamic Contextual Human Motion Generation with Semantic and Geometric Guidance." Conference on Computer Vision and Pattern Recognition, 2025. doi:10.1109/CVPR52734.2025.01636

Markdown

[Cong et al. "SemGeoMo: Dynamic Contextual Human Motion Generation with Semantic and Geometric Guidance." Conference on Computer Vision and Pattern Recognition, 2025.](https://mlanthology.org/cvpr/2025/cong2025cvpr-semgeomo/) doi:10.1109/CVPR52734.2025.01636

BibTeX

@inproceedings{cong2025cvpr-semgeomo,
  title     = {{SemGeoMo: Dynamic Contextual Human Motion Generation with Semantic and Geometric Guidance}},
  author    = {Cong, Peishan and Wang, Ziyi and Ma, Yuexin and Yue, Xiangyu},
  booktitle = {Conference on Computer Vision and Pattern Recognition},
  year      = {2025},
  pages     = {17561--17570},
  doi       = {10.1109/CVPR52734.2025.01636},
  url       = {https://mlanthology.org/cvpr/2025/cong2025cvpr-semgeomo/}
}