Multi-Label Activity Recognition Using Activity-Specific Features and Activity Correlations

Abstract

Multi-label activity recognition aims to recognize multiple activities that are performed simultaneously or sequentially in each video. Most recent activity recognition networks focus on single-activity recognition and assume only one activity per video. These networks extract features shared across all activities, a design that is not suited to multi-label activities. We introduce an approach to multi-label activity recognition that extracts independent feature descriptors for each activity and learns activity correlations. This structure can be trained end-to-end and plugged into any existing network structure for video classification. Our method outperformed state-of-the-art approaches on four multi-label activity recognition datasets. To better understand the activity-specific features that the system generated, we visualized these features on the Charades dataset. The code will be released later.
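
The idea of per-activity feature descriptors combined with learned activity correlations can be illustrated with a small NumPy sketch. This is not the authors' implementation: the attention-based pooling, the linear correlation matrix, and all names (`multi_label_head`, `queries`, `classifiers`, `correlation`) are assumptions chosen for illustration, and random values stand in for trained weights and backbone features.

```python
import numpy as np


def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def multi_label_head(features, queries, classifiers, correlation):
    """Hypothetical multi-label head (illustrative assumption, not the paper's code).

    features:    (N, C) backbone features at N spatio-temporal locations
    queries:     (A, C) one attention query per activity
    classifiers: (A, C) one linear classifier per activity
    correlation: (A, A) mixing matrix modelling activity correlations
    """
    scale = np.sqrt(features.shape[1])
    # Each activity attends to the locations most relevant to it.
    attention = softmax(queries @ features.T / scale, axis=1)  # (A, N)
    # Independent feature descriptor for each activity.
    activity_features = attention @ features                   # (A, C)
    # Per-activity score computed from that activity's own descriptor.
    logits = np.sum(activity_features * classifiers, axis=1)   # (A,)
    # Let each activity's score be adjusted by the other activities' scores.
    refined = correlation @ logits                             # (A,)
    # Independent sigmoid per activity, since several labels may be active.
    return sigmoid(refined), attention


rng = np.random.default_rng(0)
N, C, A = 8, 16, 5  # locations, channels, activities (arbitrary sizes)
features = rng.standard_normal((N, C))
queries = rng.standard_normal((A, C))
classifiers = rng.standard_normal((A, C))
correlation = np.eye(A) + 0.1 * rng.standard_normal((A, A))

probs, attention = multi_label_head(features, queries, classifiers, correlation)
print(probs.shape, attention.shape)
```

Each activity gets its own attention map and its own descriptor, in contrast to a shared pooled feature, and the per-activity sigmoid outputs allow several activities to be predicted for the same video. The attention maps are also the kind of quantity one could visualize to inspect what each activity-specific feature responds to.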

Cite

Text

Zhang et al. "Multi-Label Activity Recognition Using Activity-Specific Features and Activity Correlations." Conference on Computer Vision and Pattern Recognition, 2021. doi:10.1109/CVPR46437.2021.01439

Markdown

[Zhang et al. "Multi-Label Activity Recognition Using Activity-Specific Features and Activity Correlations." Conference on Computer Vision and Pattern Recognition, 2021.](https://mlanthology.org/cvpr/2021/zhang2021cvpr-multilabel/) doi:10.1109/CVPR46437.2021.01439

BibTeX

@inproceedings{zhang2021cvpr-multilabel,
  title     = {{Multi-Label Activity Recognition Using Activity-Specific Features and Activity Correlations}},
  author    = {Zhang, Yanyi and Li, Xinyu and Marsic, Ivan},
  booktitle = {Conference on Computer Vision and Pattern Recognition},
  year      = {2021},
  pages     = {14625--14635},
  doi       = {10.1109/CVPR46437.2021.01439},
  url       = {https://mlanthology.org/cvpr/2021/zhang2021cvpr-multilabel/}
}