SILK: Smooth InterpoLation frameworK for Motion In-Betweening

Abstract

Motion in-betweening is a crucial tool for animators, enabling intricate control over pose-level details in each keyframe. Recent machine learning solutions for motion in-betweening rely on complex models, incorporating skeleton-aware architectures or requiring multiple modules and training steps. In this work, we introduce a simple yet effective Transformer-based framework, employing a single Transformer encoder to synthesize realistic motions in motion in-betweening tasks. We find that data modeling choices play a significant role in improving in-betweening performance. Among other findings, we show that increasing data volume can yield equivalent or improved motion transitions, that the choice of pose representation is vital for achieving high-quality results, and that incorporating velocity input features enhances animation performance. These findings challenge the assumption that model complexity is the primary determinant of animation quality and provide insights into a more data-centric approach to motion interpolation. Additional videos and supplementary material are available at https://silk-paper.github.io.
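The abstract's data-centric points (pose representation, velocity input features) can be illustrated with a minimal sketch of how an in-betweening input might be assembled: the two known keyframe poses, a naive linear-interpolation baseline filling the gap, and finite-difference velocities appended as extra features. This is a hypothetical illustration, not the authors' pipeline; function and variable names are invented.

```python
import numpy as np

def build_inbetween_input(start_pose, end_pose, n_missing):
    """Toy in-betweening input: linearly interpolate poses across the
    gap and append finite-difference velocity features.
    (Hypothetical sketch; SILK's actual pipeline may differ.)"""
    # Interpolation weights for the full window, keyframes included.
    t = np.linspace(0.0, 1.0, n_missing + 2)[:, None]
    # Linear-interpolation baseline between the two keyframe poses.
    poses = (1.0 - t) * start_pose[None] + t * end_pose[None]
    # Velocity features as frame-to-frame differences (first frame: zero).
    velocities = np.diff(poses, axis=0, prepend=poses[:1])
    return np.concatenate([poses, velocities], axis=-1)

start = np.zeros(3)                # toy 3-DoF pose
end = np.array([1.0, 2.0, 3.0])
features = build_inbetween_input(start, end, n_missing=3)
print(features.shape)              # (5, 6): 5 frames, pose + velocity
```

In a Transformer-based setup like the one described, such per-frame feature vectors would be the token sequence the encoder refines into a smooth transition.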

Cite

Text

Akhoundi et al. "SILK: Smooth InterpoLation frameworK for Motion In-Betweening." IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, 2025.

Markdown

[Akhoundi et al. "SILK: Smooth InterpoLation frameworK for Motion In-Betweening." IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, 2025.](https://mlanthology.org/cvprw/2025/akhoundi2025cvprw-silk/)

BibTeX

@inproceedings{akhoundi2025cvprw-silk,
  title     = {{SILK: Smooth InterpoLation frameworK for Motion In-Betweening}},
  author    = {Akhoundi, Elly and Ling, Hung Yu and Deshmukh, Anup Anand and Bütepage, Judith},
  booktitle = {IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops},
  year      = {2025},
  pages     = {2900--2909},
  url       = {https://mlanthology.org/cvprw/2025/akhoundi2025cvprw-silk/}
}