Action Recognition with Coarse-to-Fine Deep Feature Integration and Asynchronous Fusion

Abstract

Action recognition is an important yet challenging task in computer vision. In this paper, we propose a novel deep learning-based framework for action recognition, which improves recognition accuracy by: 1) deriving more precise features for representing actions, and 2) reducing the asynchrony between different information streams. We first introduce a coarse-to-fine network which extracts shared deep features at different action class granularities and progressively integrates them to obtain a more accurate feature representation for input actions. We further introduce an asynchronous fusion network, which fuses information from different streams by asynchronously integrating stream-wise features at different time points, and hence better leverages the complementary information in different streams. Experimental results on action recognition benchmarks demonstrate that our approach achieves state-of-the-art performance.
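
The asynchronous fusion idea in the abstract can be illustrated with a minimal sketch. This is not the authors' network: the function name `async_fuse`, the `offsets` parameter, the use of concatenation followed by plain averaging, and the clamping at sequence boundaries are all assumptions made for illustration. The sketch only shows the core idea of pairing one stream's feature at time `t` with the other stream's features at several nearby time points, rather than at `t` alone.

```python
from typing import List, Sequence

Vector = List[float]


def async_fuse(
    appearance: Sequence[Vector],
    motion: Sequence[Vector],
    t: int,
    offsets: Sequence[int],
) -> Vector:
    """Fuse the appearance feature at time t with motion features at t + offset.

    Hypothetical simplification: each pair is fused by concatenation and the
    fused vectors are averaged. A learned fusion module would replace both steps.
    """
    fused = []
    for d in offsets:
        # Clamp the shifted index so it stays inside the motion sequence.
        j = min(max(t + d, 0), len(motion) - 1)
        fused.append(list(appearance[t]) + list(motion[j]))
    # Element-wise mean over all fused (appearance, motion) pairs.
    return [sum(col) / len(fused) for col in zip(*fused)]


# Toy per-frame features for two streams (values are made up).
appearance_feats = [[1.0], [2.0], [3.0]]
motion_feats = [[10.0], [20.0], [30.0]]

# Synchronous fusion uses only offset 0; asynchronous fusion looks at several.
sync_result = async_fuse(appearance_feats, motion_feats, t=0, offsets=[0])
async_result = async_fuse(appearance_feats, motion_feats, t=0, offsets=[0, 1, 2])
```

With `offsets=[0]` the sketch reduces to ordinary same-time-point fusion, which makes the contrast with the asynchronous case easy to see on the toy data.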

Cite

Text

Lin et al. "Action Recognition with Coarse-to-Fine Deep Feature Integration and Asynchronous Fusion." AAAI Conference on Artificial Intelligence, 2018. doi:10.1609/AAAI.V32I1.12232

Markdown

[Lin et al. "Action Recognition with Coarse-to-Fine Deep Feature Integration and Asynchronous Fusion." AAAI Conference on Artificial Intelligence, 2018.](https://mlanthology.org/aaai/2018/lin2018aaai-action/) doi:10.1609/AAAI.V32I1.12232

BibTeX

@inproceedings{lin2018aaai-action,
  title     = {{Action Recognition with Coarse-to-Fine Deep Feature Integration and Asynchronous Fusion}},
  author    = {Lin, Weiyao and Zhang, Chongyang and Lu, Ke and Sheng, Bin and Wu, Jianxin and Ni, Bingbing and Liu, Xin and Xiong, Hongkai},
  booktitle = {AAAI Conference on Artificial Intelligence},
  year      = {2018},
  pages     = {7130--7137},
  doi       = {10.1609/AAAI.V32I1.12232},
  url       = {https://mlanthology.org/aaai/2018/lin2018aaai-action/}
}