Spatio-Temporal Transformer Network for Video Restoration
Abstract
State-of-the-art video restoration methods integrate optical flow estimation networks to exploit temporal information. However, these networks typically consider only a pair of consecutive frames and hence cannot capture long-range temporal dependencies or establish correspondences across several timesteps. To alleviate these problems, we propose a novel Spatio-temporal Transformer Network (STTN) which handles multiple frames at once and thereby mitigates the common nuisance of occlusions in optical flow estimation. Our proposed STTN comprises a module that estimates optical flow in both space and time and a resampling layer that selectively warps target frames using the estimated flow. In our experiments, we demonstrate the efficiency of the proposed network and show state-of-the-art restoration results in video super-resolution and video deblurring.
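The abstract describes a two-part architecture: a flow module that predicts per-pixel spatio-temporal offsets, and a resampling layer that warps a stack of frames accordingly. The sketch below is a rough illustration only, not the authors' implementation: it assumes PyTorch, and the function name spatio_temporal_sample, the tensor layouts, and the flow convention (per-pixel (dx, dy, dt) offsets relative to the most recent frame) are all assumptions of this sketch. It shows how such a resampling layer could be realized with trilinear grid sampling over a frame volume.

import torch
import torch.nn.functional as F

def spatio_temporal_sample(frames, flow):
    """Trilinearly sample one output frame from a stack of input frames.

    frames: (B, C, T, H, W) input frame volume.
    flow:   (B, 3, H, W) spatio-temporal flow (dx, dy, dt) per output pixel,
            in pixel/frame units (a hypothetical convention for this sketch).
    returns: (B, C, H, W) resampled frame.
    """
    B, C, T, H, W = frames.shape
    # Base sampling grid: identity mapping into the most recent frame (t = T-1).
    ys, xs = torch.meshgrid(
        torch.arange(H, device=frames.device, dtype=frames.dtype),
        torch.arange(W, device=frames.device, dtype=frames.dtype),
        indexing="ij",
    )
    x = xs.unsqueeze(0) + flow[:, 0]  # (B, H, W) horizontal sample positions
    y = ys.unsqueeze(0) + flow[:, 1]  # (B, H, W) vertical sample positions
    t = (T - 1) + flow[:, 2]          # (B, H, W) temporal sample positions
    # Normalize to [-1, 1] as required by grid_sample; last dim is (x, y, z).
    grid = torch.stack(
        [2 * x / (W - 1) - 1, 2 * y / (H - 1) - 1, 2 * t / (T - 1) - 1], dim=-1
    ).unsqueeze(1)  # (B, 1, H, W, 3): a single output "depth" slice
    # With 5D input, mode="bilinear" performs trilinear interpolation.
    out = F.grid_sample(frames, grid, mode="bilinear", align_corners=True)
    return out.squeeze(2)  # (B, C, 1, H, W) -> (B, C, H, W)

Because the sampling grid is differentiable with respect to the flow, a flow-estimation module placed in front of this layer can be trained end-to-end from the restoration loss alone, which is the general idea behind such transformer-style resampling layers.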
Cite
Text
Hyun Kim et al. "Spatio-Temporal Transformer Network for Video Restoration." Proceedings of the European Conference on Computer Vision (ECCV), 2018. doi:10.1007/978-3-030-01219-9_7

Markdown
[Hyun Kim et al. "Spatio-Temporal Transformer Network for Video Restoration." Proceedings of the European Conference on Computer Vision (ECCV), 2018.](https://mlanthology.org/eccv/2018/hyunkim2018eccv-spatiotemporal/) doi:10.1007/978-3-030-01219-9_7

BibTeX
@inproceedings{hyunkim2018eccv-spatiotemporal,
title = {{Spatio-Temporal Transformer Network for Video Restoration}},
author = {Hyun Kim, Tae and Sajjadi, Mehdi S. M. and Hirsch, Michael and Sch{\"o}lkopf, Bernhard},
booktitle = {Proceedings of the European Conference on Computer Vision (ECCV)},
year = {2018},
doi = {10.1007/978-3-030-01219-9_7},
url = {https://mlanthology.org/eccv/2018/hyunkim2018eccv-spatiotemporal/}
}