Arbitrary-Scale Video Super-Resolution with Structural and Textural Priors

Abstract

Arbitrary-scale video super-resolution (AVSR) aims to enhance the resolution of video frames, potentially at various scaling factors, which presents several challenges regarding spatial detail reproduction, temporal consistency, and computational complexity. In this paper, we first describe a strong baseline for AVSR by putting together three variants of elementary building blocks: 1) a flow-guided recurrent unit that aggregates spatiotemporal information from previous frames, 2) a flow-refined cross-attention unit that selects spatiotemporal information from future frames, and 3) a hyper-upsampling unit that generates scale-aware and content-independent upsampling kernels. We then introduce ST-AVSR by equipping our baseline with a multi-scale structural and textural prior computed from the pre-trained VGG network. This prior has proven effective in discriminating structure and texture across different locations and scales, which is beneficial for AVSR. Comprehensive experiments show that ST-AVSR significantly improves super-resolution quality, generalization ability, and inference speed over the state-of-the-art. The code is available at https://github.com/shangwei5/ST-AVSR.
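
To make the prior concrete, below is a minimal, hypothetical PyTorch sketch of how a multi-scale structural and textural prior could be computed from a frozen, ImageNet-pretrained VGG16. The `MultiScaleVGGPrior` module name and the particular feature depths (relu1_2, relu2_2, relu3_3) are illustrative assumptions, not the exact configuration used in ST-AVSR; see the official repository for the authors' implementation.

```python
# Hypothetical sketch: multi-scale structural/textural prior from a frozen VGG16.
import torch
import torch.nn as nn
from torchvision.models import vgg16, VGG16_Weights


class MultiScaleVGGPrior(nn.Module):
    """Extract features at several depths of a frozen, ImageNet-pretrained VGG16.

    The chosen cut points (relu1_2, relu2_2, relu3_3) are an illustrative
    assumption; the paper only states that a multi-scale VGG prior is used.
    """

    def __init__(self):
        super().__init__()
        backbone = vgg16(weights=VGG16_Weights.IMAGENET1K_V1).features.eval()
        for p in backbone.parameters():
            p.requires_grad_(False)  # the prior is fixed, not fine-tuned
        # Split the backbone into three stages ending at successive ReLUs.
        self.stage1 = backbone[:4]    # up to relu1_2, full resolution
        self.stage2 = backbone[4:9]   # up to relu2_2, 1/2 resolution
        self.stage3 = backbone[9:16]  # up to relu3_3, 1/4 resolution

    @torch.no_grad()
    def forward(self, frame: torch.Tensor):
        # frame: (B, 3, H, W), assumed ImageNet-normalized
        f1 = self.stage1(frame)
        f2 = self.stage2(f1)
        f3 = self.stage3(f2)
        # Multi-scale features that could condition the super-resolution branch.
        return [f1, f2, f3]


if __name__ == "__main__":
    prior = MultiScaleVGGPrior()
    x = torch.randn(1, 3, 128, 128)
    for i, f in enumerate(prior(x), start=1):
        print(f"scale {i}: {tuple(f.shape)}")
```

Freezing the VGG backbone keeps the prior content-agnostic and adds no trainable parameters; only the downstream super-resolution modules would be learned.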

Cite

Text

Shang et al. "Arbitrary-Scale Video Super-Resolution with Structural and Textural Priors." Proceedings of the European Conference on Computer Vision (ECCV), 2024. doi:10.1007/978-3-031-72998-0_5

Markdown

[Shang et al. "Arbitrary-Scale Video Super-Resolution with Structural and Textural Priors." Proceedings of the European Conference on Computer Vision (ECCV), 2024.](https://mlanthology.org/eccv/2024/shang2024eccv-arbitraryscale/) doi:10.1007/978-3-031-72998-0_5

BibTeX

@inproceedings{shang2024eccv-arbitraryscale,
  title     = {{Arbitrary-Scale Video Super-Resolution with Structural and Textural Priors}},
  author    = {Shang, Wei and Ren, Dongwei and Zhang, Wanying and Fang, Yuming and Zuo, Wangmeng and Ma, Kede},
  booktitle = {Proceedings of the European Conference on Computer Vision (ECCV)},
  year      = {2024},
  doi       = {10.1007/978-3-031-72998-0_5},
  url       = {https://mlanthology.org/eccv/2024/shang2024eccv-arbitraryscale/}
}