A Simple Baseline for Video Restoration with Grouped Spatial-Temporal Shift

Abstract

Video restoration, which aims to restore clear frames from degraded videos, has numerous important applications. The key to video restoration lies in utilizing inter-frame information. However, existing deep learning methods often rely on complicated network architectures, such as optical flow estimation, deformable convolution, and cross-frame self-attention layers, resulting in high computational costs. In this study, we propose a simple yet effective framework for video restoration. Our approach is based on grouped spatial-temporal shift, which is a lightweight and straightforward technique that can implicitly capture inter-frame correspondences for multi-frame aggregation. By introducing grouped spatial shift, we attain expansive effective receptive fields. Combined with basic 2D convolution, this simple framework can effectively aggregate inter-frame information. Extensive experiments demonstrate that our framework outperforms the previous state-of-the-art method, while using less than a quarter of its computational cost, on both video deblurring and video denoising tasks. These results indicate the potential for our approach to significantly reduce computational overhead while maintaining high-quality results. Code is available at https://github.com/dasongli1/Shift-Net.
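
As a rough illustration of the shift idea described in the abstract, the NumPy sketch below moves half of the feature channels one frame forward in time and the other half one frame backward, then gives each channel group its own spatial offset. It is not the authors' implementation (the linked repository holds that). The function names `shift2d` and `grouped_spatial_temporal_shift`, the half/half channel split, the zero filling at borders, and the example offsets are all assumptions made for illustration.

```python
import numpy as np

def shift2d(x, dy, dx):
    """Shift the last two axes of x by (dy, dx), zero-filling vacated cells.

    After the call, out[..., y, x] == x[..., y - dy, x - dx] where that
    source position exists, and 0 elsewhere.
    """
    h, w = x.shape[-2:]
    py, px = abs(dy), abs(dx)
    pad = [(0, 0)] * (x.ndim - 2) + [(py, py), (px, px)]
    xp = np.pad(x, pad)  # constant (zero) padding by default
    return xp[..., py - dy:py - dy + h, px - dx:px - dx + w]

def grouped_spatial_temporal_shift(feats, offsets):
    """Hypothetical grouped spatial-temporal shift on feats of shape (T, C, H, W).

    Assumed layout: the first half of the channels receives features from the
    previous frame, the second half from the next frame. Within each half,
    channels are split into len(offsets) groups, and group i is spatially
    shifted by offsets[i] = (dy, dx).
    """
    t, c, h, w = feats.shape
    half = c // 2
    out = np.zeros_like(feats)

    # Temporal shift: frame t sees frame t-1 (first half) or t+1 (second half).
    out[1:, :half] = feats[:-1, :half]
    out[:-1, half:] = feats[1:, half:]

    # Grouped spatial shift: a different offset per channel group.
    for part in (slice(0, half), slice(half, c)):
        groups = np.array_split(np.arange(c)[part], len(offsets))
        for idx, (dy, dx) in zip(groups, offsets):
            out[:, idx] = shift2d(out[:, idx], dy, dx)
    return out

# Example: 3 frames, 4 channels, 3x3 features, two offset groups per half.
feats = np.arange(3 * 4 * 3 * 3, dtype=float).reshape(3, 4, 3, 3) + 1.0
shifted = grouped_spatial_temporal_shift(feats, offsets=[(0, 1), (1, 0)])
```

In this sketch each output position holds a feature taken from a neighbouring frame at a displaced location, so a plain 2D convolution applied afterwards would mix information across frames and across a wider spatial neighbourhood than its kernel alone covers.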

Cite

Text

Li et al. "A Simple Baseline for Video Restoration with Grouped Spatial-Temporal Shift." Conference on Computer Vision and Pattern Recognition, 2023. doi:10.1109/CVPR52729.2023.00947

Markdown

[Li et al. "A Simple Baseline for Video Restoration with Grouped Spatial-Temporal Shift." Conference on Computer Vision and Pattern Recognition, 2023.](https://mlanthology.org/cvpr/2023/li2023cvpr-simple/) doi:10.1109/CVPR52729.2023.00947

BibTeX

@inproceedings{li2023cvpr-simple,
  title     = {{A Simple Baseline for Video Restoration with Grouped Spatial-Temporal Shift}},
  author    = {Li, Dasong and Shi, Xiaoyu and Zhang, Yi and Cheung, Ka Chun and See, Simon and Wang, Xiaogang and Qin, Hongwei and Li, Hongsheng},
  booktitle = {Conference on Computer Vision and Pattern Recognition},
  year      = {2023},
  pages     = {9822-9832},
  doi       = {10.1109/CVPR52729.2023.00947},
  url       = {https://mlanthology.org/cvpr/2023/li2023cvpr-simple/}
}