Extreme Low Resolution Action Recognition with Spatial-Temporal Multi-Head Self-Attention and Knowledge Distillation
Abstract
This paper proposes a two-stream network with a novel spatial-temporal multi-head self-attention mechanism for action recognition in extreme low-resolution (LR) videos. The approach first uses a super-resolution (SR) mechanism to provide better visual information and thereby facilitate network training. To provide more discriminative spatio-temporal features, a knowledge distillation scheme consisting of teacher and student models is employed to enhance the network using knowledge from a high-resolution (HR) model. Moreover, the two-stream network is combined with a new spatial-temporal multi-head self-attention network to effectively learn long-term temporal dependencies. Simulations demonstrate that the proposed method surpasses state-of-the-art methods for extreme LR action recognition on two widely used datasets, HMDB-51 and IXMAS.
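The abstract does not state the loss used to transfer knowledge from the HR teacher to the LR student. As a rough illustration only, the sketch below shows a generic soft-target distillation loss (temperature-scaled softmax plus a KL term), which is one common way such teacher-student schemes are set up. The function names, the `temperature` and `alpha` parameters, and the weighting are assumptions for illustration, not the authors' formulation.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over a list of class logits."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, label,
                      temperature=4.0, alpha=0.5):
    """Hypothetical generic distillation loss (not the paper's exact loss).

    Combines cross-entropy on the ground-truth label with
    KL(teacher || student) computed on temperature-softened outputs.
    """
    # Hard-label term: cross-entropy of the student at temperature 1.
    p_student = softmax(student_logits)
    cross_entropy = -math.log(p_student[label])

    # Soft-label term: KL divergence between softened distributions.
    p_teacher_soft = softmax(teacher_logits, temperature)
    p_student_soft = softmax(student_logits, temperature)
    kl = sum(t * (math.log(t) - math.log(s))
             for t, s in zip(p_teacher_soft, p_student_soft) if t > 0)

    # The T^2 factor keeps the soft-term gradients on a comparable scale.
    return alpha * cross_entropy + (1 - alpha) * temperature ** 2 * kl


# Example: logits from an HR teacher and an LR student for 3 classes.
loss = distillation_loss([0.5, 1.0, 2.0], [0.1, 0.4, 3.0], label=2)
```

In this sketch, setting `alpha=1.0` recovers plain supervised training, while smaller values give the teacher's softened predictions more influence on the student.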
Cite
Text
Purwanto et al. "Extreme Low Resolution Action Recognition with Spatial-Temporal Multi-Head Self-Attention and Knowledge Distillation." IEEE/CVF International Conference on Computer Vision Workshops, 2019. doi:10.1109/ICCVW.2019.00125
Markdown
[Purwanto et al. "Extreme Low Resolution Action Recognition with Spatial-Temporal Multi-Head Self-Attention and Knowledge Distillation." IEEE/CVF International Conference on Computer Vision Workshops, 2019.](https://mlanthology.org/iccvw/2019/purwanto2019iccvw-extreme/) doi:10.1109/ICCVW.2019.00125
BibTeX
@inproceedings{purwanto2019iccvw-extreme,
title = {{Extreme Low Resolution Action Recognition with Spatial-Temporal Multi-Head Self-Attention and Knowledge Distillation}},
author = {Purwanto, Didik and Pramono, Rizard Renanda Adhi and Chen, Yie-Tarng and Fang, Wen-Hsien},
booktitle = {IEEE/CVF International Conference on Computer Vision Workshops},
year = {2019},
pages = {961-969},
doi = {10.1109/ICCVW.2019.00125},
url = {https://mlanthology.org/iccvw/2019/purwanto2019iccvw-extreme/}
}