Learning Human Interaction by Interactive Phrases

Abstract

In this paper, we present a novel approach for human interaction recognition from videos. We introduce high-level descriptions called interactive phrases to express binary semantic motion relationships between interacting people. Interactive phrases naturally exploit human knowledge to describe interactions and allow us to construct a more descriptive model for recognizing human interactions. We propose a novel hierarchical model to encode interactive phrases based on the latent SVM framework, in which interactive phrases are treated as latent variables. The interdependencies between interactive phrases are explicitly captured in the model to deal with motion ambiguity and partial occlusion in interactions. We evaluate our method on the newly collected BIT-Interaction dataset and on the UT-Interaction dataset. Promising results demonstrate the effectiveness of the proposed method.
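The role of latent phrase variables and their interdependencies can be illustrated with a minimal sketch. This is not the authors' model or code: the function names, the score tables, and the brute-force enumeration over phrase assignments are all assumptions chosen for illustration, and the weights below are invented rather than learned. Each phrase is a binary latent variable, a candidate interaction class is scored by maximizing over all phrase assignments, and a pairwise term rewards phrases that tend to co-occur.

```python
from itertools import product


def best_phrase_assignment(unary, pairwise):
    """Maximize a score over binary latent phrase labels by enumeration.

    unary[j][v]               : score for setting phrase j to v (0 or 1)
    pairwise[(j, k)][vj][vk]  : compatibility of phrases j and k
    Both tables are hypothetical stand-ins for learned model weights.
    """
    n = len(unary)
    best_score, best_labels = float("-inf"), None
    for labels in product((0, 1), repeat=n):
        score = sum(unary[j][labels[j]] for j in range(n))
        score += sum(
            table[labels[j]][labels[k]] for (j, k), table in pairwise.items()
        )
        if score > best_score:
            best_score, best_labels = score, labels
    return best_score, best_labels


def predict(models):
    """Return the class whose best latent assignment scores highest."""
    scored = {name: best_phrase_assignment(u, p) for name, (u, p) in models.items()}
    label = max(scored, key=lambda name: scored[name][0])
    return label, scored[label][1]


# Illustrative values only. Phrase 1 has weak direct evidence (for
# example, because of occlusion), but its co-occurrence with phrase 0
# is rewarded by the pairwise term.
unary = [[0.0, 1.0], [0.5, 0.0]]
pairwise = {(0, 1): [[0.0, 0.0], [0.0, 2.0]]}

score, labels = best_phrase_assignment(unary, pairwise)
label, phrases = predict({
    "handshake": (unary, pairwise),
    "push": ([[0.0, 0.2], [0.1, 0.0]], {}),
})
```

In this toy setting the pairwise term switches phrase 1 on even though its unary score favors switching it off, which is the kind of effect the abstract attributes to modeling interdependencies. Enumeration is exponential in the number of phrases, so a practical model would need a more efficient inference scheme.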

Cite

Text

Kong et al. "Learning Human Interaction by Interactive Phrases." European Conference on Computer Vision, 2012. doi:10.1007/978-3-642-33718-5_22

Markdown

[Kong et al. "Learning Human Interaction by Interactive Phrases." European Conference on Computer Vision, 2012.](https://mlanthology.org/eccv/2012/kong2012eccv-learning/) doi:10.1007/978-3-642-33718-5_22

BibTeX

@inproceedings{kong2012eccv-learning,
  title     = {{Learning Human Interaction by Interactive Phrases}},
  author    = {Kong, Yu and Jia, Yunde and Fu, Yun},
  booktitle = {European Conference on Computer Vision},
  year      = {2012},
  pages     = {300--313},
  doi       = {10.1007/978-3-642-33718-5_22},
  url       = {https://mlanthology.org/eccv/2012/kong2012eccv-learning/}
}