Concurrent Action Detection with Structural Prediction

Abstract

Action recognition has often been posed as a classification problem, which assumes that a video sequence has only one action class label and that different actions are independent. However, a single human body can perform multiple actions concurrently, and different actions interact with each other. This paper proposes a concurrent action detection model in which action detection is formulated as a structural prediction problem. In this model, an interval in a video sequence can be described by multiple action labels. A detected action interval is determined both by the unary local detector and by its relations with other actions. We use a wavelet feature to represent the action sequence, and design a composite temporal logic descriptor to describe the action relations. The model parameters are trained by structural SVM learning. Given a long video sequence, a sequential decision window search algorithm is designed to detect the actions. Experiments on our newly collected concurrent action dataset demonstrate the strength of our method.

Cite

Text

Wei et al. "Concurrent Action Detection with Structural Prediction." International Conference on Computer Vision, 2013. doi:10.1109/ICCV.2013.389

Markdown

[Wei et al. "Concurrent Action Detection with Structural Prediction." International Conference on Computer Vision, 2013.](https://mlanthology.org/iccv/2013/wei2013iccv-concurrent/) doi:10.1109/ICCV.2013.389

BibTeX

@inproceedings{wei2013iccv-concurrent,
  title     = {{Concurrent Action Detection with Structural Prediction}},
  author    = {Wei, Ping and Zheng, Nanning and Zhao, Yibiao and Zhu, Song-Chun},
  booktitle = {International Conference on Computer Vision},
  year      = {2013},
  doi       = {10.1109/ICCV.2013.389},
  url       = {https://mlanthology.org/iccv/2013/wei2013iccv-concurrent/}
}