Adaptive Caching for Faster Video Generation with Diffusion Transformers
Abstract
Generating temporally consistent, high-fidelity videos can be computationally expensive, especially over longer temporal spans. More recent Diffusion Transformers (DiTs), despite making significant headway in this context, have only heightened such challenges, as they rely on larger models and heavier attention mechanisms, resulting in slower inference. In this paper, we introduce a method to accelerate video DiTs, termed Adaptive Caching (AdaCache), which is motivated by the fact that "not all videos are created equal": some videos require fewer denoising steps than others to attain a reasonable quality. Building on this, we not only cache computations through the diffusion process, but also devise a caching schedule tailored to each video generation, maximizing the quality-latency trade-off. We further introduce a Motion Regularization (MoReg) scheme to utilize video information within AdaCache, essentially controlling the compute allocation based on motion content. Altogether, our plug-and-play contributions grant significant inference speedups (e.g., up to 4.7x on Open-Sora 720p, 2s video generation) without sacrificing generation quality, across multiple video DiT baselines.
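The idea in the abstract, reusing cached computations for a number of denoising steps chosen per generation, can be illustrated with a toy sketch. This is not the authors' implementation: the function names, the thresholds in `steps_to_reuse`, the L1 change metric, and the `(1 + motion)` scaling are all assumptions made for illustration. The sketch caches a block's residual, measures how much it changed since the last real evaluation, and skips more steps when the change is small; a hypothetical motion score inflates the measured change so that high-motion content is recomputed more often.

```python
from typing import Callable, List, Optional, Sequence, Tuple

Vector = List[float]


def l1_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Mean absolute difference between two equally sized vectors."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def steps_to_reuse(change: float) -> int:
    """Hypothetical schedule: the smaller the change, the longer the reuse."""
    if change < 0.05:
        return 3
    if change < 0.20:
        return 1
    return 0


def run_with_adaptive_cache(
    block: Callable[[Vector, int], Vector],
    x: Vector,
    num_steps: int,
    motion: float = 0.0,
) -> Tuple[Vector, int]:
    """Apply x <- x + block(x, t) for num_steps steps with residual caching.

    Returns the final state and the number of real block evaluations.
    `motion` is an assumed non-negative score; larger values shorten reuse.
    """
    cached: Optional[Vector] = None
    skip = 0
    evaluations = 0
    for t in range(num_steps):
        if cached is not None and skip > 0:
            # Reuse the cached residual instead of running the block.
            residual = cached
            skip -= 1
        else:
            residual = block(x, t)
            evaluations += 1
            if cached is not None:
                # Motion-scaled change decides how many steps to skip next.
                change = l1_distance(residual, cached) * (1.0 + motion)
                skip = steps_to_reuse(change)
            cached = residual
        x = [xi + ri for xi, ri in zip(x, residual)]
    return x, evaluations
```

In this toy setting, a block whose residual barely changes is evaluated only a few times, while one whose residual changes sharply at every step is evaluated at every step, which mirrors the "not all videos are created equal" intuition at a very small scale.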
Cite
Text
Kahatapitiya et al. "Adaptive Caching for Faster Video Generation with Diffusion Transformers." International Conference on Computer Vision, 2025.
Markdown
[Kahatapitiya et al. "Adaptive Caching for Faster Video Generation with Diffusion Transformers." International Conference on Computer Vision, 2025.](https://mlanthology.org/iccv/2025/kahatapitiya2025iccv-adaptive/)
BibTeX
@inproceedings{kahatapitiya2025iccv-adaptive,
title = {{Adaptive Caching for Faster Video Generation with Diffusion Transformers}},
author = {Kahatapitiya, Kumara and Liu, Haozhe and He, Sen and Liu, Ding and Jia, Menglin and Zhang, Chenyang and Ryoo, Michael S. and Xie, Tian},
booktitle = {International Conference on Computer Vision},
year = {2025},
pages = {15240-15252},
url = {https://mlanthology.org/iccv/2025/kahatapitiya2025iccv-adaptive/}
}