Categorizing Turn-Taking Interactions

Abstract

We address the problem of categorizing turn-taking interactions between individuals. Social interactions are characterized by turn-taking and arise frequently in real-world videos. Our approach is based on the use of temporal causal analysis to decompose a space-time visual word representation of video into co-occurring independent segments, called causal sets [1]. These causal sets then serve as the input to a multiple instance learning framework to categorize turn-taking interactions. We introduce a new turn-taking interactions dataset consisting of social games and sports rallies. We demonstrate that our formulation of multiple instance learning (QP-MISVM) is better able to leverage the repetitive structure in turn-taking interactions and achieves superior performance relative to a conventional bag-of-words model.

Cite

Text

Prabhakar and Rehg. "Categorizing Turn-Taking Interactions." European Conference on Computer Vision, 2012. doi:10.1007/978-3-642-33715-4_28

Markdown

[Prabhakar and Rehg. "Categorizing Turn-Taking Interactions." European Conference on Computer Vision, 2012.](https://mlanthology.org/eccv/2012/prabhakar2012eccv-categorizing/) doi:10.1007/978-3-642-33715-4_28

BibTeX

@inproceedings{prabhakar2012eccv-categorizing,
  title     = {{Categorizing Turn-Taking Interactions}},
  author    = {Prabhakar, Karthir and Rehg, James M.},
  booktitle = {European Conference on Computer Vision},
  year      = {2012},
  pages     = {383--396},
  doi       = {10.1007/978-3-642-33715-4_28},
  url       = {https://mlanthology.org/eccv/2012/prabhakar2012eccv-categorizing/}
}