Building an End-to-End Spatial-Temporal Convolutional Network for Video Super-Resolution
Abstract
We propose an end-to-end deep network for video super-resolution. Our network is composed of a spatial component that encodes intra-frame visual patterns, a temporal component that discovers inter-frame relations, and a reconstruction component that aggregates information to predict details. We make the spatial component deep so that it can better leverage spatial redundancies for rebuilding high-frequency structures. We organize the temporal component in a bidirectional, multi-scale fashion to better capture how frames change across time. We demonstrate the effectiveness of the proposed approach on two datasets, observing substantial improvements over the state of the art.
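To make the three-component design concrete, below is a minimal PyTorch sketch: per-frame 2D convolutions stand in for the spatial component, bidirectional multi-scale temporal convolutions for the temporal component, and a final convolution that reconstructs a high-resolution residual for the center frame. All layer counts, channel widths, class names (SpatialEncoder, TemporalFuser, STVSR), and the fusion scheme are illustrative assumptions, not the paper's exact architecture; the single-channel (luminance) and bicubic-upscaled input are likewise assumptions.

import torch
import torch.nn as nn


class SpatialEncoder(nn.Module):
    """Deep stack of 2D convolutions applied to each frame independently
    (depth and width are illustrative assumptions)."""
    def __init__(self, channels=64, depth=5):
        super().__init__()
        layers = [nn.Conv2d(1, channels, 3, padding=1), nn.ReLU(inplace=True)]
        for _ in range(depth - 1):
            layers += [nn.Conv2d(channels, channels, 3, padding=1),
                       nn.ReLU(inplace=True)]
        self.body = nn.Sequential(*layers)

    def forward(self, x):          # x: (B*T, 1, H, W) luminance frames
        return self.body(x)


class TemporalFuser(nn.Module):
    """One assumed form of bidirectional, multi-scale temporal aggregation:
    temporal convolutions of width 3 and 5, run over the feature sequence
    and its time reversal, then averaged."""
    def __init__(self, channels=64):
        super().__init__()
        self.scale1 = nn.Conv3d(channels, channels, (3, 1, 1), padding=(1, 0, 0))
        self.scale2 = nn.Conv3d(channels, channels, (5, 1, 1), padding=(2, 0, 0))

    def forward(self, f):          # f: (B, C, T, H, W) per-frame features
        def run(x):
            return torch.relu(self.scale1(x) + self.scale2(x))
        fwd = run(f)
        bwd = run(f.flip(dims=[2])).flip(dims=[2])   # reversed time order
        return 0.5 * (fwd + bwd)


class STVSR(nn.Module):
    """End-to-end model: spatial encoding -> temporal fusion ->
    reconstruction of the center frame's high-resolution residual."""
    def __init__(self, channels=64):
        super().__init__()
        self.spatial = SpatialEncoder(channels)
        self.temporal = TemporalFuser(channels)
        self.reconstruct = nn.Conv2d(channels, 1, 3, padding=1)

    def forward(self, clip):       # clip: (B, T, 1, H, W), bicubic-upscaled
        b, t, c, h, w = clip.shape
        feats = self.spatial(clip.reshape(b * t, c, h, w))
        feats = feats.reshape(b, t, -1, h, w).permute(0, 2, 1, 3, 4)
        fused = self.temporal(feats)                 # (B, C, T, H, W)
        center = fused[:, :, t // 2]                 # keep center frame
        residual = self.reconstruct(center)
        return clip[:, t // 2] + residual            # residual learning


if __name__ == "__main__":
    model = STVSR()
    lr_clip = torch.randn(2, 5, 1, 32, 32)           # toy 5-frame clip
    print(model(lr_clip).shape)                      # torch.Size([2, 1, 32, 32])

Running the temporal convolutions on both the original and the time-reversed feature sequence is one simple way to obtain bidirectionality with shared weights; the multi-scale aspect here comes from the parallel temporal kernels of width 3 and 5.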
Cite
Text
Guo and Chao. "Building an End-to-End Spatial-Temporal Convolutional Network for Video Super-Resolution." AAAI Conference on Artificial Intelligence, 2017. doi:10.1609/AAAI.V31I1.11228
Markdown
[Guo and Chao. "Building an End-to-End Spatial-Temporal Convolutional Network for Video Super-Resolution." AAAI Conference on Artificial Intelligence, 2017.](https://mlanthology.org/aaai/2017/guo2017aaai-building/) doi:10.1609/AAAI.V31I1.11228
BibTeX
@inproceedings{guo2017aaai-building,
title = {{Building an End-to-End Spatial-Temporal Convolutional Network for Video Super-Resolution}},
author = {Guo, Jun and Chao, Hongyang},
booktitle = {AAAI Conference on Artificial Intelligence},
year = {2017},
pages = {4053--4060},
doi = {10.1609/AAAI.V31I1.11228},
url = {https://mlanthology.org/aaai/2017/guo2017aaai-building/}
}