Text-to-Any-Skeleton Motion Generation Without Retargeting

Abstract

Text-driven motion generation has advanced considerably in recent years. However, existing methods are typically limited to standardized skeletons and rely on a cumbersome retargeting process to adapt to the varying skeletal configurations of diverse characters. In this paper, we present OmniSkel, a novel framework that directly generates high-quality human motions for any user-defined skeleton without retargeting. Specifically, we introduce a skeleton-aware RVQ-VAE, which uses Kinematic Graph Cross Attention (K-GCA) to effectively integrate skeletal information into motion encoding and reconstruction. Moreover, we propose a simple yet effective training-free approach, the Motion Restoration Optimizer (MRO), which ensures zero bone length error while preserving motion smoothness. To facilitate our research, we construct SkeleMotion-3D, a large-scale text-skeleton-motion dataset built on HumanML3D. Extensive experiments demonstrate the strong robustness and generalization of our method.
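The abstract does not detail MRO's procedure, but a minimal training-free bone-length correction can be sketched: walk the kinematic tree from the root, rescale each generated bone vector to its target length (which yields zero bone length error by construction), and lightly blend neighboring frames to limit the jitter the projection introduces. The Python function below is an illustrative sketch under those assumptions only; the function name, the smoothing scheme, and the topologically ordered parents array are ours, not the authors' implementation.

import numpy as np

def enforce_bone_lengths(joints, parents, target_lengths, smooth_weight=0.5):
    """Illustrative sketch (not the paper's MRO) of a training-free
    bone-length projection with a crude smoothness pass.

    joints:         (T, J, 3) generated joint positions over T frames
    parents:        length-J list; parents[j] is the parent of joint j
                    (-1 for the root); assumes topological order, i.e.
                    parents[j] < j, so parents are corrected first
    target_lengths: length-J array; target_lengths[j] is the length of
                    the bone from parents[j] to joint j (root ignored)
    """
    T, J, _ = joints.shape
    fixed = joints.copy()
    for j in range(J):
        p = parents[j]
        if p < 0:
            continue  # root joint has no incoming bone
        bone = fixed[:, j] - fixed[:, p]                       # (T, 3) bone vectors
        norm = np.linalg.norm(bone, axis=-1, keepdims=True)
        norm = np.maximum(norm, 1e-8)                          # guard against zero-length bones
        # Rescale the bone direction to the exact target length.
        fixed[:, j] = fixed[:, p] + bone / norm * target_lengths[j]
    if smooth_weight > 0 and T > 2:
        # Blend interior frames toward their temporal neighbors, then
        # re-run the projection so bone lengths stay exact.
        blended = fixed.copy()
        blended[1:-1] = ((1 - smooth_weight) * fixed[1:-1]
                         + smooth_weight * 0.5 * (fixed[:-2] + fixed[2:]))
        fixed = enforce_bone_lengths(blended, parents, target_lengths, 0.0)
    return fixed

Because the final pass rescales every bone to its exact target length, the bone length error is zero by construction; the neighbor-blending step is only a stand-in for whatever smoothness objective MRO actually optimizes.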

Cite

Text

Liu et al. "Text-to-Any-Skeleton Motion Generation Without Retargeting." International Conference on Computer Vision, 2025.

Markdown

[Liu et al. "Text-to-Any-Skeleton Motion Generation Without Retargeting." International Conference on Computer Vision, 2025.](https://mlanthology.org/iccv/2025/liu2025iccv-texttoanyskeleton/)

BibTeX

@inproceedings{liu2025iccv-texttoanyskeleton,
  title     = {{Text-to-Any-Skeleton Motion Generation Without Retargeting}},
  author    = {Liu, Qingyuan and Lv, Ke and Dong, Kun and Xue, Jian and Niu, Zehai and Wang, Jinbao},
  booktitle = {International Conference on Computer Vision},
  year      = {2025},
  pages     = {12926--12936},
  url       = {https://mlanthology.org/iccv/2025/liu2025iccv-texttoanyskeleton/}
}