Real-Time Video Super-Resolution with Spatio-Temporal Networks and Motion Compensation

Abstract

Convolutional neural networks have enabled accurate image super-resolution in real time. However, recent attempts to benefit from temporal correlations in video super-resolution have been limited to naive or inefficient architectures. In this paper, we introduce spatio-temporal sub-pixel convolution networks that effectively exploit temporal redundancies and improve reconstruction accuracy while maintaining real-time speed. Specifically, we discuss the use of early fusion, slow fusion and 3D convolutions for the joint processing of multiple consecutive video frames. We also propose a novel joint motion compensation and video super-resolution algorithm that is orders of magnitude more efficient than competing methods, relying on a fast multi-resolution spatial transformer module that is end-to-end trainable. These contributions provide both higher accuracy and temporally more consistent videos, which we confirm qualitatively and quantitatively. Relative to single-frame models, spatio-temporal networks can either reduce the computational cost by 30% whilst maintaining the same quality or provide a 0.2 dB gain for a similar computational cost. Results on publicly available datasets demonstrate that the proposed algorithms surpass current state-of-the-art performance in both accuracy and efficiency.
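As a rough illustration of two ideas named in the abstract, the NumPy sketch below shows early fusion (stacking consecutive low-resolution frames along the channel axis before any filtering) followed by a sub-pixel rearrangement of channels into space. It is not the authors' implementation: the helper names `early_fusion` and `pixel_shuffle`, the channel ordering used in the rearrangement, the frame sizes, and the random matrix standing in for learned convolution filters are all assumptions made for the example.

```python
import numpy as np


def early_fusion(frames):
    """Stack T frames of shape (H, W, C) along the channel axis -> (H, W, T*C)."""
    return np.concatenate(frames, axis=-1)


def pixel_shuffle(x, r):
    """Rearrange (H, W, C*r*r) features into an (H*r, W*r, C) image.

    Assumed channel layout: index = (a * r + b) * C + c, where (a, b) is the
    sub-pixel offset inside each r x r output cell.
    """
    h, w, c = x.shape
    assert c % (r * r) == 0, "channel count must be divisible by r*r"
    c_out = c // (r * r)
    x = x.reshape(h, w, r, r, c_out)   # split channels into (a, b, c)
    x = x.transpose(0, 2, 1, 3, 4)     # reorder to (h, a, w, b, c)
    return x.reshape(h * r, w * r, c_out)


rng = np.random.default_rng(0)

# Three consecutive single-channel low-resolution frames (hypothetical sizes).
frames = [rng.random((4, 6, 1)) for _ in range(3)]
fused = early_fusion(frames)           # shape (4, 6, 3)

# Stand-in for a learned 1x1 convolution: random weights mapping the fused
# channels to r*r channels (an assumption, not trained filters).
r = 3
weights = rng.standard_normal((fused.shape[-1], r * r))
features = fused @ weights             # shape (4, 6, 9)

sr = pixel_shuffle(features, r)        # shape (12, 18, 1)
print(fused.shape, features.shape, sr.shape)
```

In this sketch all filtering happens at low resolution and the upscaling is only a reindexing step, which is the property that makes sub-pixel convolution inexpensive. Slow fusion and 3D convolutions would instead merge the temporal dimension gradually across layers rather than in the first step.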

Cite

Text

Caballero et al. "Real-Time Video Super-Resolution with Spatio-Temporal Networks and Motion Compensation." Conference on Computer Vision and Pattern Recognition, 2017. doi:10.1109/CVPR.2017.304

Markdown

[Caballero et al. "Real-Time Video Super-Resolution with Spatio-Temporal Networks and Motion Compensation." Conference on Computer Vision and Pattern Recognition, 2017.](https://mlanthology.org/cvpr/2017/caballero2017cvpr-realtime/) doi:10.1109/CVPR.2017.304

BibTeX

@inproceedings{caballero2017cvpr-realtime,
  title     = {{Real-Time Video Super-Resolution with Spatio-Temporal Networks and Motion Compensation}},
  author    = {Caballero, Jose and Ledig, Christian and Aitken, Andrew and Acosta, Alejandro and Totz, Johannes and Wang, Zehan and Shi, Wenzhe},
  booktitle = {Conference on Computer Vision and Pattern Recognition},
  year      = {2017},
  doi       = {10.1109/CVPR.2017.304},
  url       = {https://mlanthology.org/cvpr/2017/caballero2017cvpr-realtime/}
}