Okutama-Action: An Aerial View Video Dataset for Concurrent Human Action Detection
Abstract
Despite significant progress in the development of human action detection datasets and algorithms, no current dataset is representative of real-world aerial view scenarios. We present Okutama-Action, a new video dataset for aerial view concurrent human action detection. It consists of 43 minute-long fully-annotated sequences with 12 action classes. Okutama-Action features many challenges missing in current datasets, including dynamic transition of actions, significant changes in scale and aspect ratio, abrupt camera movement, as well as multi-labeled actors. As a result, our dataset is more challenging than existing ones, and will help push the field forward to enable real-world applications.
Cite
Text
Barekatain et al. "Okutama-Action: An Aerial View Video Dataset for Concurrent Human Action Detection." IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, 2017. doi:10.1109/CVPRW.2017.267
Markdown
[Barekatain et al. "Okutama-Action: An Aerial View Video Dataset for Concurrent Human Action Detection." IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, 2017.](https://mlanthology.org/cvprw/2017/barekatain2017cvprw-okutamaaction/) doi:10.1109/CVPRW.2017.267
BibTeX
@inproceedings{barekatain2017cvprw-okutamaaction,
title = {{Okutama-Action: An Aerial View Video Dataset for Concurrent Human Action Detection}},
author = {Barekatain, Mohammadamin and Martí, Miquel and Shih, Hsueh-Fu and Murray, Samuel and Nakayama, Kotaro and Matsuo, Yutaka and Prendinger, Helmut},
booktitle = {IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops},
year = {2017},
pages = {2153-2160},
doi = {10.1109/CVPRW.2017.267},
url = {https://mlanthology.org/cvprw/2017/barekatain2017cvprw-okutamaaction/}
}