Bridging the Gap Between Human Motion and Action Semantics via Kinematics Phrases

Abstract

Motion understanding aims to establish a reliable mapping between motion and action semantics, but this is a challenging many-to-many problem. An abstract action semantic (e.g., walk forwards) can be conveyed by perceptually diverse motions (walking with arms up or swinging). Conversely, a single motion can carry different semantics depending on its context and intention. This makes an elegant mapping between them difficult. Previous attempts adopted direct-mapping paradigms with limited reliability. Moreover, current automatic metrics fail to provide reliable assessments of the consistency between motions and action semantics. We identify the source of these problems as the significant gap between the two modalities. To alleviate this gap, we propose Kinematic Phrases (KP), which capture the objective kinematic facts of human motion with proper abstraction, interpretability, and generality. Based on KP, we can unify a motion knowledge base and build a motion understanding system. Meanwhile, KP can be automatically converted from motions to text descriptions with no subjective bias, inspiring Kinematic Prompt Generation (KPG) as a novel white-box motion generation benchmark. In extensive experiments, our approach shows superiority over other methods. Our project is available at https://foruck.github.io/KP/.

Cite

Text

Liu et al. "Bridging the Gap Between Human Motion and Action Semantics via Kinematics Phrases." Proceedings of the European Conference on Computer Vision (ECCV), 2024. doi:10.1007/978-3-031-73242-3_13

Markdown

[Liu et al. "Bridging the Gap Between Human Motion and Action Semantics via Kinematics Phrases." Proceedings of the European Conference on Computer Vision (ECCV), 2024.](https://mlanthology.org/eccv/2024/liu2024eccv-bridging/) doi:10.1007/978-3-031-73242-3_13

BibTeX

@inproceedings{liu2024eccv-bridging,
  title     = {{Bridging the Gap Between Human Motion and Action Semantics via Kinematics Phrases}},
  author    = {Liu, Xinpeng and Li, Yong-Lu and Zeng, Ailing and Zhou, Zizheng and You, Yang and Lu, Cewu},
  booktitle = {Proceedings of the European Conference on Computer Vision (ECCV)},
  year      = {2024},
  doi       = {10.1007/978-3-031-73242-3_13},
  url       = {https://mlanthology.org/eccv/2024/liu2024eccv-bridging/}
}