Okutama-Action: An Aerial View Video Dataset for Concurrent Human Action Detection

Abstract

Despite significant progress in the development of human action detection datasets and algorithms, no current dataset is representative of real-world aerial view scenarios. We present Okutama-Action, a new video dataset for aerial view concurrent human action detection. It consists of 43 minute-long fully-annotated sequences with 12 action classes. Okutama-Action features many challenges missing in current datasets, including dynamic transition of actions, significant changes in scale and aspect ratio, abrupt camera movement, as well as multi-labeled actors. As a result, our dataset is more challenging than existing ones, and will help push the field forward to enable real-world applications.

Cite

Text

Barekatain et al. "Okutama-Action: An Aerial View Video Dataset for Concurrent Human Action Detection." IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, 2017. doi:10.1109/CVPRW.2017.267

Markdown

[Barekatain et al. "Okutama-Action: An Aerial View Video Dataset for Concurrent Human Action Detection." IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, 2017.](https://mlanthology.org/cvprw/2017/barekatain2017cvprw-okutamaaction/) doi:10.1109/CVPRW.2017.267

BibTeX

@inproceedings{barekatain2017cvprw-okutamaaction,
  title     = {{Okutama-Action: An Aerial View Video Dataset for Concurrent Human Action Detection}},
  author    = {Barekatain, Mohammadamin and Martí, Miquel and Shih, Hsueh-Fu and Murray, Samuel and Nakayama, Kotaro and Matsuo, Yutaka and Prendinger, Helmut},
  booktitle = {IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops},
  year      = {2017},
  pages     = {2153--2160},
  doi       = {10.1109/CVPRW.2017.267},
  url       = {https://mlanthology.org/cvprw/2017/barekatain2017cvprw-okutamaaction/}
}