stagNet: An Attentive Semantic RNN for Group Activity Recognition
Abstract
Group activity recognition plays a fundamental role in a variety of applications, e.g. sports video analysis and intelligent surveillance. How to model the spatio-temporal contextual information in a scene still remains a crucial yet challenging issue. We propose a novel attentive semantic recurrent neural network (RNN), namely stagNet, for understanding group activities in videos, based on the spatio-temporal attention and semantic graph. A semantic graph is explicitly modeled to describe the spatial context of the whole scene, which is further integrated with the temporal factor via structural-RNN. Benefiting from the 'factor sharing' and 'message passing' mechanisms, our model is able to extract discriminative spatio-temporal features and to capture inter-group relationships. Moreover, we adopt a spatio-temporal attention model to attend to key persons/frames for improved performance. Two widely-used datasets are employed for performance evaluation, and the extensive results demonstrate the superiority of our method.
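The attention idea mentioned in the abstract (weighting key persons or frames more heavily) can be illustrated with a generic softmax attention pooling step. This is only a minimal sketch under stated assumptions, not the authors' implementation: the function names `softmax` and `attention_pool`, the toy features, and the hand-picked scores are all hypothetical, and in a real model the scores would come from a learned scoring function.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(features, scores):
    """Return the attention-weighted sum of feature vectors and the weights.

    features: list of equal-length feature vectors (one per person or frame)
    scores:   list of unnormalized relevance scores (hypothetical here; a
              trained model would predict them from the features)
    """
    weights = softmax(scores)
    dim = len(features[0])
    pooled = [
        sum(w * f[d] for w, f in zip(weights, features))
        for d in range(dim)
    ]
    return pooled, weights


# Toy example: three persons with 2-D features; the second has the highest score.
person_features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
person_scores = [0.1, 2.0, 0.3]
pooled, weights = attention_pool(person_features, person_scores)
```

The same pooling could be applied over frames instead of persons by passing per-frame features and scores; how stagNet actually computes and combines its spatial and temporal attention is described in the paper itself.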
Cite
Text
Qi et al. "stagNet: An Attentive Semantic RNN for Group Activity Recognition." Proceedings of the European Conference on Computer Vision (ECCV), 2018. doi:10.1007/978-3-030-01249-6_7
Markdown
[Qi et al. "stagNet: An Attentive Semantic RNN for Group Activity Recognition." Proceedings of the European Conference on Computer Vision (ECCV), 2018.](https://mlanthology.org/eccv/2018/qi2018eccv-stagnet/) doi:10.1007/978-3-030-01249-6_7
BibTeX
@inproceedings{qi2018eccv-stagnet,
title = {{stagNet: An Attentive Semantic RNN for Group Activity Recognition}},
author = {Qi, Mengshi and Qin, Jie and Li, Annan and Wang, Yunhong and Luo, Jiebo and Van Gool, Luc},
booktitle = {Proceedings of the European Conference on Computer Vision (ECCV)},
year = {2018},
doi = {10.1007/978-3-030-01249-6_7},
url = {https://mlanthology.org/eccv/2018/qi2018eccv-stagnet/}
}