CDAD: A Common Daily Action Dataset with Collected Hard Negative Samples

Abstract

Research on action understanding has made significant progress with the establishment of various benchmark datasets. In practice, however, the results are still far from satisfactory. One reason is that existing action datasets ignore the many hard negative samples that arise in real-world scenarios, which are usually undefined, easily confused actions, e.g., holding a pen near the mouth vs. smoking. In this work, we focus on common actions in daily life and present a novel Common Daily Action Dataset (CDAD), which consists of 57,824 video clips of 23 well-defined common daily actions with rich manual annotations. In particular, for each daily action we collect not only diverse positive samples but also various hard negative samples that differ only slightly from, and thus share strong similarities with, the positive ones. CDAD can not only serve as a benchmark for several important daily action understanding tasks, including multi-label action recognition, temporal action localization, and spatio-temporal action detection, but also provide a testbed for investigating the influence of highly similar negative samples when learning action understanding models. The dataset and code are available at https://github.com/MartinXM/CDAD.
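To make the role of the collected hard negatives concrete, the minimal sketch below shows one way such samples could be folded into multi-label recognition training: a hard negative for an action contributes an explicit zero target for that class, rather than simply being absent from the labels. The clip structure, field names, and the `build_targets` helper are hypothetical illustrations, not the released CDAD format or code.

```python
# Hypothetical illustration: turning positive and hard-negative annotations
# into multi-label targets. The clip/field layout is an assumption, not the
# actual CDAD release format.

NUM_ACTIONS = 23  # CDAD defines 23 common daily actions


def build_targets(clip):
    """Return (targets, mask) lists of length NUM_ACTIONS.

    targets[c] = 1 for annotated positive actions, 0 otherwise.
    mask[c]    = 1 for classes the clip gives explicit evidence about, i.e.
                 its positives and its collected hard negatives; other
                 classes could be down-weighted or ignored in the loss.
    """
    targets = [0.0] * NUM_ACTIONS
    mask = [0.0] * NUM_ACTIONS
    for c in clip["positive_actions"]:   # actions actually performed
        targets[c] = 1.0
        mask[c] = 1.0
    for c in clip["hard_negative_for"]:  # look-alike, but not this action
        mask[c] = 1.0                    # explicit zero target for class c
    return targets, mask


# Toy example: a clip of someone holding a pen near the mouth could be a
# hard negative for "smoking" (class indices chosen arbitrarily here).
clip = {"positive_actions": [5], "hard_negative_for": [12]}
print(build_targets(clip))
```

The mask could then weight a per-class binary cross-entropy loss so that the confusing classes receive explicit negative supervision; this is only one possible use of the hard negatives, stated here as an assumption rather than the authors' method.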

Cite

Text

Xiang et al. "CDAD: A Common Daily Action Dataset with Collected Hard Negative Samples." IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, 2022. doi:10.1109/CVPRW56347.2022.00437

Markdown

[Xiang et al. "CDAD: A Common Daily Action Dataset with Collected Hard Negative Samples." IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, 2022.](https://mlanthology.org/cvprw/2022/xiang2022cvprw-cdad/) doi:10.1109/CVPRW56347.2022.00437

BibTeX

@inproceedings{xiang2022cvprw-cdad,
  title     = {{CDAD: A Common Daily Action Dataset with Collected Hard Negative Samples}},
  author    = {Xiang, Wangmeng and Li, Chao and Li, Ke and Wang, Biao and Hua, Xian-Sheng and Zhang, Lei},
  booktitle = {IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops},
  year      = {2022},
  pages     = {3920--3929},
  doi       = {10.1109/CVPRW56347.2022.00437},
  url       = {https://mlanthology.org/cvprw/2022/xiang2022cvprw-cdad/}
}