Human Action Adverb Recognition: ADHA Dataset and a Three-Stream Hybrid Model

Abstract

We introduce "Adverbs Describing Human Actions" (ADHA), the first benchmark for a new problem: recognizing human action adverbs (HAA). We highlight three key features of ADHA: a semantically complete set of adverbs describing human actions, a set of common, describable human actions, and an exhaustive labelling of simultaneously emerging actions in each video. We conduct an in-depth analysis of how current effective models for action recognition and image captioning perform on adverb recognition, and the results reveal that such methods are unsatisfactory. Furthermore, we propose a novel three-stream hybrid model to tackle the HAA problem, which outperforms these methods and achieves relatively promising results.

Cite

Text

Pang et al. "Human Action Adverb Recognition: ADHA Dataset and a Three-Stream Hybrid Model." IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, 2018. doi:10.1109/CVPRW.2018.00308

Markdown

[Pang et al. "Human Action Adverb Recognition: ADHA Dataset and a Three-Stream Hybrid Model." IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, 2018.](https://mlanthology.org/cvprw/2018/pang2018cvprw-human/) doi:10.1109/CVPRW.2018.00308

BibTeX

@inproceedings{pang2018cvprw-human,
  title     = {{Human Action Adverb Recognition: ADHA Dataset and a Three-Stream Hybrid Model}},
  author    = {Pang, Bo and Zha, Kaiwen and Lu, Cewu},
  booktitle = {IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops},
  year      = {2018},
  pages     = {2325--2334},
  doi       = {10.1109/CVPRW.2018.00308},
  url       = {https://mlanthology.org/cvprw/2018/pang2018cvprw-human/}
}