Two-Stream Flow-Guided Convolutional Attention Networks for Action Recognition

Abstract

This paper proposes two-stream flow-guided convolutional attention networks for action recognition in videos. The central idea is that optical flow, when properly compensated for camera motion, can be used to guide attention to the human foreground. We thus develop crosslink layers from the temporal network (trained on flows) to the spatial network (trained on RGB frames). These crosslink layers guide the spatial stream to pay more attention to the human foreground areas and to be less affected by background clutter. Our approach achieves promising performance on the UCF101, HMDB51 and Hollywood2 datasets.
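The core mechanism described in the abstract, using flow to reweight spatial-stream features toward moving foreground regions, can be illustrated with a minimal NumPy sketch. This is a simplified residual-attention illustration, not the paper's exact crosslink-layer architecture; the function name and shapes are assumptions for the example.

```python
import numpy as np

def flow_guided_attention(features, flow, eps=1e-8):
    """Reweight spatial-stream features with an attention map derived
    from optical-flow magnitude (simplified sketch, not the paper's
    exact crosslink layer).

    features: (C, H, W) spatial-stream feature map
    flow:     (H, W, 2) optical flow, assumed camera-motion compensated
    """
    # Flow magnitude is large where the (moving) human foreground is.
    mag = np.sqrt((flow ** 2).sum(axis=-1))                  # (H, W)
    # Normalize to [0, 1] to form a soft attention map.
    attn = (mag - mag.min()) / (mag.max() - mag.min() + eps)
    # Residual attention: keep the original signal, boost moving regions.
    return features * (1.0 + attn[None, :, :])

# Toy usage: a uniform feature map and a flow field with one moving block.
feat = np.ones((4, 8, 8))
flow = np.zeros((8, 8, 2))
flow[2:6, 2:6] = 3.0  # hypothetical moving "foreground" region
out = flow_guided_attention(feat, flow)
```

After reweighting, responses inside the moving block are amplified relative to the static background, which is the intuition behind letting the temporal stream guide the spatial stream's attention.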

Cite

Text

Tran and Cheong. "Two-Stream Flow-Guided Convolutional Attention Networks for Action Recognition." IEEE/CVF International Conference on Computer Vision Workshops, 2017. doi:10.1109/ICCVW.2017.368

Markdown

[Tran and Cheong. "Two-Stream Flow-Guided Convolutional Attention Networks for Action Recognition." IEEE/CVF International Conference on Computer Vision Workshops, 2017.](https://mlanthology.org/iccvw/2017/tran2017iccvw-twostream/) doi:10.1109/ICCVW.2017.368

BibTeX

@inproceedings{tran2017iccvw-twostream,
  title     = {{Two-Stream Flow-Guided Convolutional Attention Networks for Action Recognition}},
  author    = {Tran, An and Cheong, Loong-Fah},
  booktitle = {IEEE/CVF International Conference on Computer Vision Workshops},
  year      = {2017},
  pages     = {3110--3119},
  doi       = {10.1109/ICCVW.2017.368},
  url       = {https://mlanthology.org/iccvw/2017/tran2017iccvw-twostream/}
}