ACTION-Net: Multipath Excitation for Action Recognition

Wang, Zhengwei; She, Qi; Smolic, Aljosa

doi:10.1109/CVPR46437.2021.01301

CVPR 2021 pp. 13214-13223

Abstract

Spatio-temporal, channel-wise, and motion patterns are three complementary and crucial types of information for video action recognition. Conventional 2D CNNs are computationally cheap but cannot capture temporal relationships; 3D CNNs can achieve good performance but are computationally intensive. In this work, we tackle this dilemma by designing a generic and effective module that can be embedded into 2D CNNs. To this end, we propose a spAtio-temporal, Channel and moTion excitatION (ACTION) module consisting of three paths: a Spatio-Temporal Excitation (STE) path, a Channel Excitation (CE) path, and a Motion Excitation (ME) path. The STE path employs a one-channel 3D convolution to characterize the spatio-temporal representation. The CE path adaptively recalibrates channel-wise feature responses by explicitly modeling interdependencies between channels along the temporal dimension. The ME path calculates feature-level temporal differences, which are then used to excite motion-sensitive channels. We equip 2D CNNs with the proposed ACTION module to form a simple yet effective ACTION-Net with very limited extra computational cost. ACTION-Net consistently outperforms its 2D CNN counterparts on three backbones (i.e., ResNet-50, MobileNet V2, and BNInception) across three datasets (i.e., Something-Something V2, Jester, and EgoGesture). Code is provided at https://github.com/V-Sense/ACTION-Net.
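To make the ME path's idea of "feature-level temporal differences" concrete, here is a minimal NumPy sketch. It is not the authors' implementation (see the repository linked above for that). The function name `motion_excitation`, the zero padding of the last time step, the use of plain spatial average pooling, and the residual form `x + x * mask` are all assumptions chosen for illustration, and the learned convolutions a real module would contain are left out.

```python
import numpy as np


def motion_excitation(x):
    """Simplified, hypothetical motion-excitation step.

    x: feature array of shape (T, C, H, W) for a single clip.
    Returns an array of the same shape with channels re-weighted
    by a mask derived from temporal differences.
    """
    # Feature-level temporal difference between consecutive time steps.
    diff = x[1:] - x[:-1]                                  # (T-1, C, H, W)
    # Assumption: pad the last time step with zeros so T is preserved.
    diff = np.concatenate([diff, np.zeros_like(x[:1])], axis=0)
    # Assumption: summarise each channel by spatial average pooling.
    pooled = diff.mean(axis=(2, 3), keepdims=True)         # (T, C, 1, 1)
    # Sigmoid turns the pooled differences into weights in (0, 1).
    mask = 1.0 / (1.0 + np.exp(-pooled))
    # Assumption: residual excitation, so the input features are kept
    # and the mask only adds emphasis on top of them.
    return x + x * mask


# Usage: channel 0 changes over time, channel 1 is static.
T, C, H, W = 4, 2, 3, 3
clip = np.ones((T, C, H, W))
clip[:, 0] *= np.arange(1, T + 1).reshape(T, 1, 1)
out = motion_excitation(clip)
```

In this toy clip the static channel receives a weight of exactly 0.5 (the sigmoid of a zero difference), while the channel whose activations grow over time receives a larger weight on every step except the zero-padded last one. A trained module would learn how differences map to weights rather than applying the sigmoid to raw differences.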

Cite

Text

Wang et al. "ACTION-Net: Multipath Excitation for Action Recognition." Conference on Computer Vision and Pattern Recognition, 2021. doi:10.1109/CVPR46437.2021.01301

Markdown

[Wang et al. "ACTION-Net: Multipath Excitation for Action Recognition." Conference on Computer Vision and Pattern Recognition, 2021.](https://mlanthology.org/cvpr/2021/wang2021cvpr-actionnet/) doi:10.1109/CVPR46437.2021.01301

BibTeX

@inproceedings{wang2021cvpr-actionnet,
  title     = {{ACTION-Net: Multipath Excitation for Action Recognition}},
  author    = {Wang, Zhengwei and She, Qi and Smolic, Aljosa},
  booktitle = {Conference on Computer Vision and Pattern Recognition},
  year      = {2021},
  pages     = {13214-13223},
  doi       = {10.1109/CVPR46437.2021.01301},
  url       = {https://mlanthology.org/cvpr/2021/wang2021cvpr-actionnet/}
}