TurboVSR: Fantastic Video Upscalers and Where to Find Them

Wang, Zhongdao; Zhao, Guodongfang; Ren, Jingjing; Feng, Bailan; Zhang, Shifeng; Li, Wenbo

ICCV 2025 pp. 18132-18142

Abstract

Diffusion-based generative models have demonstrated exceptional promise in the video super-resolution (VSR) task, achieving a substantial advancement in detail generation relative to prior methods. However, these approaches face significant computational efficiency challenges. For instance, current techniques may require tens of minutes to super-resolve a mere 2-second, 1080p video. In this paper, we present TurboVSR, an ultra-efficient diffusion-based video super-resolution model. Our core design comprises three key aspects: (1) We employ an autoencoder with a high compression ratio of 32x32x8 to reduce the number of tokens. (2) Highly compressed latents pose substantial challenges for training. We introduce factorized conditioning to mitigate the learning complexity: we first learn to super-resolve the initial frame; subsequently, we condition the super-resolution of the remaining frames on the high-resolution initial frame and the low-resolution subsequent frames. (3) We convert the pre-trained diffusion model to a shortcut model to enable fewer sampling steps, further accelerating inference. As a result, TurboVSR performs on par with state-of-the-art VSR methods while being 100+ times faster, taking only 7 seconds to process a 2-second, 1080p video. TurboVSR also supports image super-resolution by treating an image as a one-frame video. Our efficient design makes SR beyond 1080p possible: results on 4K (3648x2048) image SR show surprisingly fine details.
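To give a rough sense of why the compression ratio in point (1) matters, the sketch below counts latent positions for a 2-second 1080p clip. It rests on several assumptions not stated in the abstract: that 32x32x8 means a factor of 32 along each spatial axis and 8 along time, that the clip runs at 24 fps, that each latent position becomes one token, and that a typical comparison autoencoder compresses by 8x8x4. The helper `latent_positions` is hypothetical and is not the authors' code.

```python
import math


def latent_positions(frames, height, width, t_ratio, s_ratio):
    """Count latent positions after compressing by t_ratio in time
    and by s_ratio along each spatial axis (rounding up for padding)."""
    return (
        math.ceil(frames / t_ratio)
        * math.ceil(height / s_ratio)
        * math.ceil(width / s_ratio)
    )


# Assumed clip: 2 seconds at 24 fps, 1920x1080 output resolution.
FRAMES, HEIGHT, WIDTH = 48, 1080, 1920

# Assumed reading of the abstract's 32x32x8 ratio: 32 per spatial axis, 8 in time.
high_compression = latent_positions(FRAMES, HEIGHT, WIDTH, t_ratio=8, s_ratio=32)

# Assumed comparison point: an 8x8x4 autoencoder (illustrative, not from the abstract).
low_compression = latent_positions(FRAMES, HEIGHT, WIDTH, t_ratio=4, s_ratio=8)

print(high_compression)                              # 12240
print(low_compression)                               # 388800
print(round(low_compression / high_compression, 1))  # 31.8
```

Under these assumptions the higher ratio leaves roughly 32 times fewer positions for the diffusion model to process. Since self-attention cost grows faster than linearly in sequence length, the saving in compute can be larger than the token ratio alone suggests.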

Cite

Text

Wang et al. "TurboVSR: Fantastic Video Upscalers and Where to Find Them." International Conference on Computer Vision, 2025.

Markdown

[Wang et al. "TurboVSR: Fantastic Video Upscalers and Where to Find Them." International Conference on Computer Vision, 2025.](https://mlanthology.org/iccv/2025/wang2025iccv-turbovsr/)

BibTeX

@inproceedings{wang2025iccv-turbovsr,
  title     = {{TurboVSR: Fantastic Video Upscalers and Where to Find Them}},
  author    = {Wang, Zhongdao and Zhao, Guodongfang and Ren, Jingjing and Feng, Bailan and Zhang, Shifeng and Li, Wenbo},
  booktitle = {International Conference on Computer Vision},
  year      = {2025},
  pages     = {18132-18142},
  url       = {https://mlanthology.org/iccv/2025/wang2025iccv-turbovsr/}
}