HiRF: Hierarchical Random Field for Collective Activity Recognition in Videos

Abstract

This paper addresses the problem of recognizing and localizing coherent activities of a group of people, called collective activities, in video. Related work has argued the benefits of capturing long-range and higher-order dependencies among video features for robust recognition. To this end, we formulate a new deep model, called Hierarchical Random Field (HiRF). HiRF models only hierarchical dependencies between model variables, which effectively amounts to modeling higher-order temporal dependencies of video features. We specify an efficient inference algorithm for HiRF that, in each step, solves a linear program to estimate latent variables. Learning of HiRF parameters is specified within the max-margin framework. Our evaluation on the benchmark New Collective Activity and Collective Activity datasets demonstrates that HiRF yields superior recognition and localization as compared to the state of the art.

Cite

Text

Amer et al. "HiRF: Hierarchical Random Field for Collective Activity Recognition in Videos." European Conference on Computer Vision, 2014. doi:10.1007/978-3-319-10599-4_37

Markdown

[Amer et al. "HiRF: Hierarchical Random Field for Collective Activity Recognition in Videos." European Conference on Computer Vision, 2014.](https://mlanthology.org/eccv/2014/amer2014eccv-hirf/) doi:10.1007/978-3-319-10599-4_37

BibTeX

@inproceedings{amer2014eccv-hirf,
  title     = {{HiRF: Hierarchical Random Field for Collective Activity Recognition in Videos}},
  author    = {Amer, Mohamed Rabie and Lei, Peng and Todorovic, Sinisa},
  booktitle = {European Conference on Computer Vision},
  year      = {2014},
  pages     = {572--585},
  doi       = {10.1007/978-3-319-10599-4_37},
  url       = {https://mlanthology.org/eccv/2014/amer2014eccv-hirf/}
}