Arbitrary-Scale Video Super-Resolution with Structural and Textural Priors
Abstract
Arbitrary-scale video super-resolution (AVSR) aims to enhance the resolution of video frames, potentially at various scaling factors, which presents several challenges regarding spatial detail reproduction, temporal consistency, and computational complexity. In this paper, we first describe a strong baseline for AVSR by putting together three variants of elementary building blocks: 1) a flow-guided recurrent unit that aggregates spatiotemporal information from previous frames, 2) a flow-refined cross-attention unit that selects spatiotemporal information from future frames, and 3) a hyper-upsampling unit that generates scale-aware and content-independent upsampling kernels. We then introduce ST-AVSR by equipping our baseline with a multi-scale structural and textural prior computed from the pre-trained VGG network. This prior has proven effective in discriminating structure and texture across different locations and scales, which is beneficial for AVSR. Comprehensive experiments show that ST-AVSR significantly improves super-resolution quality, generalization ability, and inference speed over the state-of-the-art. The code is available at https://github.com/shangwei5/ST-AVSR.
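To make the "arbitrary scale" requirement concrete: unlike fixed-factor super-resolution (e.g. sub-pixel shuffling at x2 or x4), arbitrary scaling factors call for coordinate-based resampling, where each output pixel is mapped back to a continuous input coordinate. The sketch below is a minimal, generic bilinear resampler in pure Python illustrating that mapping; it is not the paper's hyper-upsampling unit, which instead *learns* scale-aware kernels, and the function name and mapping convention (half-pixel centers) are assumptions for illustration.

```python
def bilinear_upsample(img, scale):
    """Resize a 2D grid (list of lists of numbers) by an arbitrary,
    possibly non-integer scale factor via bilinear interpolation.

    Illustrative only: AVSR models such as ST-AVSR replace this fixed
    bilinear kernel with learned, scale-aware upsampling kernels.
    """
    h, w = len(img), len(img[0])
    out_h, out_w = max(1, round(h * scale)), max(1, round(w * scale))
    out = [[0.0] * out_w for _ in range(out_h)]
    for i in range(out_h):
        # Map the output pixel center back to a continuous input coordinate
        # (half-pixel-center convention), then clamp to the valid range.
        y = max(0.0, min((i + 0.5) / scale - 0.5, h - 1))
        y0 = int(y)
        y1 = min(y0 + 1, h - 1)
        wy = y - y0
        for j in range(out_w):
            x = max(0.0, min((j + 0.5) / scale - 0.5, w - 1))
            x0 = int(x)
            x1 = min(x0 + 1, w - 1)
            wx = x - x0
            # Blend the four neighboring input pixels.
            out[i][j] = ((1 - wy) * (1 - wx) * img[y0][x0]
                         + (1 - wy) * wx * img[y0][x1]
                         + wy * (1 - wx) * img[y1][x0]
                         + wy * wx * img[y1][x1])
    return out
```

Because the scale factor enters only through the coordinate mapping, the same routine handles x1.5, x2, or x3.7 alike; the paper's hyper-upsampling unit exploits the same idea but conditions the interpolation kernel on the scale.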
Cite
Text
Shang et al. "Arbitrary-Scale Video Super-Resolution with Structural and Textural Priors." Proceedings of the European Conference on Computer Vision (ECCV), 2024. doi:10.1007/978-3-031-72998-0_5
Markdown
[Shang et al. "Arbitrary-Scale Video Super-Resolution with Structural and Textural Priors." Proceedings of the European Conference on Computer Vision (ECCV), 2024.](https://mlanthology.org/eccv/2024/shang2024eccv-arbitraryscale/) doi:10.1007/978-3-031-72998-0_5
BibTeX
@inproceedings{shang2024eccv-arbitraryscale,
title = {{Arbitrary-Scale Video Super-Resolution with Structural and Textural Priors}},
author = {Shang, Wei and Ren, Dongwei and Zhang, Wanying and Fang, Yuming and Zuo, Wangmeng and Ma, Kede},
booktitle = {Proceedings of the European Conference on Computer Vision (ECCV)},
year = {2024},
doi = {10.1007/978-3-031-72998-0_5},
url = {https://mlanthology.org/eccv/2024/shang2024eccv-arbitraryscale/}
}