Articulated Kinematics Distillation from Video Diffusion Models

Abstract

We present Articulated Kinematics Distillation (AKD), a framework for generating high-fidelity character animations by merging the strengths of skeleton-based animation and modern generative models. AKD uses a skeleton-based representation for rigged 3D assets, drastically reducing the Degrees of Freedom (DoFs) by focusing on joint-level control, which allows for efficient, consistent motion synthesis. Through Score Distillation Sampling (SDS) with pre-trained video diffusion models, AKD distills complex, articulated motions while maintaining structural integrity, overcoming challenges faced by 4D neural deformation fields in preserving shape consistency. This approach is naturally compatible with physics-based simulation, ensuring physically plausible interactions. Experiments show that AKD achieves superior 3D consistency and motion quality compared with existing works on text-to-4D generation.
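The abstract describes distilling motion from a frozen, pre-trained video diffusion model onto a low-dimensional set of joint angles via Score Distillation Sampling. The snippet below is a minimal conceptual sketch of that idea in PyTorch, presented in the same unfenced style as the BibTeX entry further down; DummyRenderer, DummyVideoDiffusion, the toy noise schedule, and all dimensions are hypothetical stand-ins for illustration only and are not the authors' implementation.

# Hedged sketch of SDS applied to joint-angle DoFs (not the paper's code).
import torch
import torch.nn as nn

NUM_JOINTS, NUM_FRAMES, FRAME_DIM = 24, 16, 64  # assumed toy sizes

class DummyRenderer(nn.Module):
    """Stand-in for a differentiable renderer: joint angles -> video frames."""
    def __init__(self):
        super().__init__()
        self.proj = nn.Linear(NUM_JOINTS * 3, FRAME_DIM)

    def forward(self, joint_angles):                  # (frames, joints, 3)
        return self.proj(joint_angles.flatten(1))     # (frames, frame_dim)

class DummyVideoDiffusion(nn.Module):
    """Stand-in for a frozen, pre-trained video diffusion noise predictor."""
    def __init__(self):
        super().__init__()
        self.net = nn.Linear(FRAME_DIM, FRAME_DIM)

    def forward(self, noisy_frames, t):
        return self.net(noisy_frames)

renderer = DummyRenderer().requires_grad_(False)      # renderer is not trained
diffusion = DummyVideoDiffusion().requires_grad_(False)

# Joint-level DoFs are the only optimized variables, in contrast to the
# much higher-dimensional parameters of a 4D neural deformation field.
joint_angles = torch.zeros(NUM_FRAMES, NUM_JOINTS, 3, requires_grad=True)
optimizer = torch.optim.Adam([joint_angles], lr=1e-2)

for step in range(100):
    frames = renderer(joint_angles)                   # differentiable render
    t = torch.randint(1, 1000, (1,))
    alpha = 1.0 - t.item() / 1000.0                   # toy noise schedule
    noise = torch.randn_like(frames)
    noisy = alpha ** 0.5 * frames + (1.0 - alpha) ** 0.5 * noise

    with torch.no_grad():
        pred_noise = diffusion(noisy, t)              # frozen score estimate

    # SDS-style update: nudge rendered frames along (pred_noise - noise);
    # gradients flow only through the renderer into the joint angles.
    grad = (pred_noise - noise).detach()
    loss = (frames * grad).sum()
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

In the actual method the renderer would pose a rigged 3D asset from the joint angles and the diffusion model would be conditioned on a text prompt; this sketch only illustrates how the SDS gradient reaches the articulated, joint-level parameters.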

Cite

Text

Li et al. "Articulated Kinematics Distillation from Video Diffusion Models." Conference on Computer Vision and Pattern Recognition, 2025. doi:10.1109/CVPR52734.2025.01637

Markdown

[Li et al. "Articulated Kinematics Distillation from Video Diffusion Models." Conference on Computer Vision and Pattern Recognition, 2025.](https://mlanthology.org/cvpr/2025/li2025cvpr-articulated/) doi:10.1109/CVPR52734.2025.01637

BibTeX

@inproceedings{li2025cvpr-articulated,
  title     = {{Articulated Kinematics Distillation from Video Diffusion Models}},
  author    = {Li, Xuan and Ma, Qianli and Lin, Tsung-Yi and Chen, Yongxin and Jiang, Chenfanfu and Liu, Ming-Yu and Xiang, Donglai},
  booktitle = {Conference on Computer Vision and Pattern Recognition},
  year      = {2025},
  pages     = {17571--17581},
  doi       = {10.1109/CVPR52734.2025.01637},
  url       = {https://mlanthology.org/cvpr/2025/li2025cvpr-articulated/}
}