Discovering Human Interactions in Videos with Limited Data Labeling

Abstract

We present a novel approach for discovering human interactions in videos. Activity understanding techniques usually require a large number of labeled examples, which are not available in many practical cases. Here, we focus on recovering semantically meaningful clusters of human-human and human-object interactions in an unsupervised fashion. A new iterative solution is introduced based on Maximum Margin Clustering (MMC), which also accepts user feedback to refine clusters. This is achieved by formulating the whole process as a unified constrained latent max-margin clustering problem. Extensive experiments have been carried out over three challenging datasets: Collective Activity, VIRAT, and UT-Interaction. Empirical results demonstrate that the proposed algorithm can efficiently discover perfect semantic clusters of human interactions with only a small amount of labeling effort.
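The core idea behind MMC-style clustering can be illustrated with a minimal sketch. This is not the authors' constrained latent formulation; it is a generic alternating scheme, assuming one-vs-rest linear classifiers trained with a subgradient step on the L2-regularized hinge loss, with cluster assignments treated as labels. All function names and hyperparameters below are illustrative.

```python
import numpy as np

def mmc_cluster(X, k, init_labels=None, outer=20, inner=50, lr=0.1, reg=1e-3, seed=0):
    """Sketch of max-margin clustering: alternate classifier training
    and cluster reassignment until the assignment stops changing."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    labels = rng.integers(0, k, n) if init_labels is None else init_labels.copy()
    W, b = np.zeros((k, d)), np.zeros(k)
    for _ in range(outer):
        # Encode current assignments as +1/-1 targets for k one-vs-rest classifiers.
        Y = np.where(labels[:, None] == np.arange(k), 1.0, -1.0)  # (n, k)
        for _ in range(inner):
            scores = X @ W.T + b                  # (n, k) classifier scores
            active = (Y * scores) < 1.0           # points violating the margin
            W -= lr * (-(active * Y).T @ X / n + reg * W)
            b -= lr * (-(active * Y).mean(axis=0))
        # Reassign every point to its highest-scoring cluster.
        new_labels = np.argmax(X @ W.T + b, axis=1)
        if np.array_equal(new_labels, labels):
            break
        labels = new_labels
    return labels

# Usage: two well-separated Gaussian blobs, seeded with a simple
# nearest-anchor initialization (plain MMC can collapse to one cluster
# from a purely random start, which is why real formulations add
# balance constraints).
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(-3.0, 0.5, (30, 2)), rng.normal(3.0, 0.5, (30, 2))])
init = np.argmin(((X[:, None, :] - X[[0, 59]]) ** 2).sum(-1), axis=1)
labels = mmc_cluster(X, k=2, init_labels=init)
```

The paper's method builds constraints from user feedback into this kind of alternating optimization, rather than relying on a heuristic initialization as done here.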

Cite

Text

Khodabandeh et al. "Discovering Human Interactions in Videos with Limited Data Labeling." IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, 2015. doi:10.1109/CVPRW.2015.7301278

Markdown

[Khodabandeh et al. "Discovering Human Interactions in Videos with Limited Data Labeling." IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, 2015.](https://mlanthology.org/cvprw/2015/khodabandeh2015cvprw-discovering/) doi:10.1109/CVPRW.2015.7301278

BibTeX

@inproceedings{khodabandeh2015cvprw-discovering,
  title     = {{Discovering Human Interactions in Videos with Limited Data Labeling}},
  author    = {Khodabandeh, Mehran and Vahdat, Arash and Zhou, Guang-Tong and Hajimirsadeghi, Hossein and Roshtkhari, Mehrsan Javan and Mori, Greg and Se, Stephen},
  booktitle = {IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops},
  year      = {2015},
  pages     = {9--18},
  doi       = {10.1109/CVPRW.2015.7301278},
  url       = {https://mlanthology.org/cvprw/2015/khodabandeh2015cvprw-discovering/}
}