Text-to-Any-Skeleton Motion Generation Without Retargeting
Abstract
Text-driven motion generation has seen notable recent progress. However, existing methods are typically limited to standardized skeletons and rely on a cumbersome retargeting process to adapt to the varying skeletal configurations of diverse characters. In this paper, we present OmniSkel, a novel framework that directly generates high-quality human motions for any user-defined skeleton without retargeting. Specifically, we introduce a skeleton-aware RVQ-VAE that uses Kinematic Graph Cross Attention (K-GCA) to integrate skeletal information into motion encoding and reconstruction. Moreover, we propose a simple yet effective training-free approach, the Motion Restoration Optimizer (MRO), which ensures zero bone-length error while preserving motion smoothness. To facilitate this research, we construct SkeleMotion-3D, a large-scale text-skeleton-motion dataset built on HumanML3D. Extensive experiments demonstrate the strong robustness and generalization of our method.
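The abstract names two components that can be made concrete. First, K-GCA injects skeletal structure into motion features via cross attention. The paper's exact architecture is not reproduced here; the following is a minimal PyTorch sketch of one plausible reading, in which per-joint skeleton embeddings are first mixed along kinematic-graph edges and motion tokens then attend to them. The class and parameter names (`KGCA`, `graph_mix`, the adjacency-based mixing) are assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn

class KGCA(nn.Module):
    """Hypothetical sketch of Kinematic Graph Cross Attention (K-GCA).
    Skeleton joint embeddings are mixed along kinematic-graph edges,
    then motion tokens attend to them via cross attention. This is an
    illustrative reading of the abstract, not the paper's design."""

    def __init__(self, dim, num_heads=4):
        super().__init__()
        self.graph_mix = nn.Linear(dim, dim)  # shared edge transform (assumption)
        self.attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)

    def forward(self, motion_tokens, skel_tokens, adjacency):
        # motion_tokens: (B, T, D) per-frame motion features
        # skel_tokens:   (B, J, D) per-joint skeleton features
        # adjacency:     (J, J)    normalized kinematic-graph adjacency
        skel = skel_tokens + adjacency @ self.graph_mix(skel_tokens)
        out, _ = self.attn(motion_tokens, skel, skel)
        return self.norm(motion_tokens + out)  # residual + norm
```

Second, MRO is described as a training-free step that enforces exact bone lengths on generated motion. A minimal sketch, assuming joints are given as 3D positions with a topologically ordered kinematic tree: re-attach each child joint along its current bone direction at the target length, which drives the bone-length error to exactly zero while changing positions continuously, so per-frame smoothness is largely retained. The function and argument names are hypothetical.

```python
import numpy as np

def restore_bone_lengths(joints, parents, target_lengths):
    """Hypothetical sketch of a training-free bone-length restoration
    step (illustrating the constraint MRO enforces, not its algorithm).

    joints:         (T, J, 3) generated joint positions over T frames
    parents:        length-J list; parents[j] is the parent joint index
                    (-1 for the root)
    target_lengths: length-J array; desired length of the bone from
                    parents[j] to joint j (ignored for the root)
    """
    out = joints.copy()
    for j in range(len(parents)):       # assumes parents precede children
        p = parents[j]
        if p < 0:
            continue                    # root has no incoming bone
        bone = out[:, j] - out[:, p]    # (T, 3) current bone vectors
        norm = np.linalg.norm(bone, axis=-1, keepdims=True)
        direction = bone / np.clip(norm, 1e-8, None)
        # Re-attach the child at exactly the target length: zero error.
        out[:, j] = out[:, p] + direction * target_lengths[j]
    return out
```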
Cite
Text

Liu et al. "Text-to-Any-Skeleton Motion Generation Without Retargeting." International Conference on Computer Vision, 2025.

Markdown

[Liu et al. "Text-to-Any-Skeleton Motion Generation Without Retargeting." International Conference on Computer Vision, 2025.](https://mlanthology.org/iccv/2025/liu2025iccv-texttoanyskeleton/)

BibTeX
@inproceedings{liu2025iccv-texttoanyskeleton,
title = {{Text-to-Any-Skeleton Motion Generation Without Retargeting}},
author = {Liu, Qingyuan and Lv, Ke and Dong, Kun and Xue, Jian and Niu, Zehai and Wang, Jinbao},
booktitle = {International Conference on Computer Vision},
year = {2025},
pages = {12926--12936},
url = {https://mlanthology.org/iccv/2025/liu2025iccv-texttoanyskeleton/}
}