Channel Attention Is All You Need for Video Frame Interpolation
Abstract
Prevailing video frame interpolation techniques rely heavily on optical flow estimation, which requires additional model complexity and computational cost; they are also susceptible to error propagation in challenging scenarios with large motion and heavy occlusion. To alleviate these limitations, we propose a simple but effective deep neural network for video frame interpolation, which is end-to-end trainable and free from a motion estimation network component. Our algorithm employs a special feature reshaping operation, referred to as PixelShuffle, with channel attention, which replaces the optical flow computation module. The main idea behind the design is to distribute the information in a feature map into multiple channels and extract motion information by attending to the channels for pixel-level frame synthesis. The model given by this principle turns out to be effective in the presence of challenging motion and occlusion. We construct a comprehensive evaluation benchmark and demonstrate that the proposed approach achieves outstanding performance compared to existing models that include an optical flow computation component.
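To make the idea concrete, below is a minimal PyTorch sketch of the pipeline the abstract describes: two input frames are reshaped from space into channels (PixelUnshuffle), processed with a channel attention block, and shuffled back into a synthesized middle frame (PixelShuffle). This is an illustrative assumption, not the authors' exact CAIN architecture; the module names, squeeze-and-excitation style attention, layer counts, and hyperparameters (`scale`, `feat`, `reduction`) are all hypothetical.

```python
import torch
import torch.nn as nn

class ChannelAttention(nn.Module):
    """Squeeze-and-excitation style attention: reweight channels globally."""
    def __init__(self, channels, reduction=16):
        super().__init__()
        self.gate = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),                     # squeeze: global spatial average
            nn.Conv2d(channels, channels // reduction, 1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // reduction, channels, 1),
            nn.Sigmoid(),                                # per-channel weights in [0, 1]
        )

    def forward(self, x):
        return x * self.gate(x)                          # excite: rescale each channel

class InterpolationSketch(nn.Module):
    """Shuffle two frames' pixels into channels, attend, shuffle back."""
    def __init__(self, scale=4, feat=64):
        super().__init__()
        in_ch = 2 * 3 * scale * scale                    # two RGB frames after unshuffle
        self.unshuffle = nn.PixelUnshuffle(scale)        # space -> channels
        self.body = nn.Sequential(
            nn.Conv2d(in_ch, feat, 3, padding=1),
            ChannelAttention(feat),                      # attend over motion-bearing channels
            nn.Conv2d(feat, 3 * scale * scale, 3, padding=1),
        )
        self.shuffle = nn.PixelShuffle(scale)            # channels -> space

    def forward(self, frame0, frame1):
        x = torch.cat([self.unshuffle(frame0), self.unshuffle(frame1)], dim=1)
        return self.shuffle(self.body(x))                # synthesized intermediate frame

# Usage: interpolate between two 3x128x128 frames.
f0 = torch.rand(1, 3, 128, 128)
f1 = torch.rand(1, 3, 128, 128)
mid = InterpolationSketch()(f0, f1)                      # -> shape (1, 3, 128, 128)
```

The design choice this illustrates: because PixelUnshuffle folds spatial displacement into the channel dimension, a block that reweights channels can select motion-relevant information without ever computing explicit optical flow.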
Cite
Text
Choi et al. "Channel Attention Is All You Need for Video Frame Interpolation." AAAI Conference on Artificial Intelligence, 2020. doi:10.1609/AAAI.V34I07.6693
Markdown
[Choi et al. "Channel Attention Is All You Need for Video Frame Interpolation." AAAI Conference on Artificial Intelligence, 2020.](https://mlanthology.org/aaai/2020/choi2020aaai-channel/) doi:10.1609/AAAI.V34I07.6693
BibTeX
@inproceedings{choi2020aaai-channel,
title = {{Channel Attention Is All You Need for Video Frame Interpolation}},
author = {Choi, Myungsub and Kim, Heewon and Han, Bohyung and Xu, Ning and Lee, Kyoung Mu},
booktitle = {AAAI Conference on Artificial Intelligence},
year = {2020},
pages = {10663--10671},
doi = {10.1609/AAAI.V34I07.6693},
url = {https://mlanthology.org/aaai/2020/choi2020aaai-channel/}
}