AnyMoLe: Any Character Motion In-Betweening Leveraging Video Diffusion Models

Abstract

Despite recent advancements in learning-based motion in-betweening, a key limitation has been overlooked: the requirement for character-specific datasets. In this work, we introduce AnyMoLe, a novel method that addresses this limitation by leveraging video diffusion models to generate motion in-between frames for arbitrary characters without external data. Our approach employs a two-stage frame generation process to enhance contextual understanding. Furthermore, to bridge the domain gap between real-world and rendered character animations, we introduce ICAdapt, a fine-tuning technique for video diffusion models. Additionally, we propose a "motion-video mimicking" optimization technique, enabling seamless motion generation for characters with arbitrary joint structures using 2D and 3D-aware features. AnyMoLe significantly reduces data dependency while generating smooth and realistic transitions, making it applicable to a wide range of motion in-betweening tasks.
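The abstract's "motion-video mimicking" step can be read as a per-frame optimization: adjust the character's joint parameters until features of its rendering match features extracted from the generated video. The following is a minimal, self-contained toy sketch of that idea only; all names (`render_features`, the fixed projection `W`), the toy feature function, and the sizes are illustrative assumptions, not the authors' implementation or API.

```python
import torch

torch.manual_seed(0)
T, J, F = 8, 20, 64                # frames, joints, feature dim (toy sizes)
target_feats = torch.randn(T, F)   # stand-in for 2D/3D-aware features of generated video frames

# Stand-in for differentiable rendering + feature extraction (assumed, toy).
W = torch.randn(J * 3, F) * 0.1    # fixed random projection

def render_features(joint_angles):
    # Maps per-frame joint rotations to a feature vector per frame.
    return torch.tanh(joint_angles.reshape(T, -1) @ W)

# Per-frame joint rotations to optimize (works for any joint count J).
joint_angles = torch.zeros(T, J, 3, requires_grad=True)
opt = torch.optim.Adam([joint_angles], lr=1e-2)

for step in range(300):
    opt.zero_grad()
    loss = torch.nn.functional.mse_loss(render_features(joint_angles), target_feats)
    loss.backward()
    opt.step()

# joint_angles now describes a motion whose toy "rendering" mimics the target video features.
```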

Cite

Text

Yun et al. "AnyMoLe: Any Character Motion In-Betweening Leveraging Video Diffusion Models." Conference on Computer Vision and Pattern Recognition, 2025. doi:10.1109/CVPR52734.2025.02592

Markdown

[Yun et al. "AnyMoLe: Any Character Motion In-Betweening Leveraging Video Diffusion Models." Conference on Computer Vision and Pattern Recognition, 2025.](https://mlanthology.org/cvpr/2025/yun2025cvpr-anymole/) doi:10.1109/CVPR52734.2025.02592

BibTeX

@inproceedings{yun2025cvpr-anymole,
  title     = {{AnyMoLe: Any Character Motion In-Betweening Leveraging Video Diffusion Models}},
  author    = {Yun, Kwan and Hong, Seokhyeon and Kim, Chaelin and Noh, Junyong},
  booktitle = {Conference on Computer Vision and Pattern Recognition},
  year      = {2025},
  pages     = {27838--27848},
  doi       = {10.1109/CVPR52734.2025.02592},
  url       = {https://mlanthology.org/cvpr/2025/yun2025cvpr-anymole/}
}