Finding Time Together: Detection and Classification of Focused Interaction in Egocentric Video

Abstract

Focused interaction occurs when co-present individuals, having mutual focus of attention, interact by establishing face-to-face engagement and direct conversation. Face-to-face engagement is often not maintained throughout the entirety of a focused interaction. In this paper, we present an online method for automatic classification of unconstrained egocentric (first-person perspective) videos into segments having no focused interaction, focused interaction when the camera wearer is stationary, and focused interaction when the camera wearer is moving. We extract features from both audio and video data streams and perform temporal segmentation by using support vector machines with linear and non-linear kernels. We provide empirical evidence that fusion of visual face track scores, camera motion profile and audio voice activity scores is an effective combination for focused interaction classification.
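The fusion-and-classification idea from the abstract can be illustrated with a minimal sketch (not the authors' code): per-frame face-track, camera-motion, and voice-activity scores are concatenated into a single feature vector and classified into the three segment classes with a non-linear SVM. All score names, thresholds, and the synthetic labels below are hypothetical stand-ins for the paper's actual features and annotations.

```python
# Hedged illustration of early fusion + SVM classification.
# Scores and labels are synthetic; the real method extracts these
# from actual audio and video streams.
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)
n = 300  # number of video frames (synthetic)

# Hypothetical per-frame scores in [0, 1]:
face_score = rng.random(n)   # visual face-track score
motion = rng.random(n)       # camera-motion magnitude
voice = rng.random(n)        # audio voice-activity score

# Synthetic labels mirroring the paper's three classes:
# 0 = no focused interaction,
# 1 = focused interaction, wearer stationary,
# 2 = focused interaction, wearer moving.
labels = np.where(face_score + voice < 0.8, 0,
                  np.where(motion < 0.5, 1, 2))

# Early fusion: stack the modality scores into one feature vector per frame.
X = np.column_stack([face_score, motion, voice])

# Non-linear (RBF) kernel SVM, as one of the kernel choices mentioned.
clf = SVC(kernel="rbf").fit(X, labels)
pred = clf.predict(X)
```

In practice the scores would come from a face tracker, a camera-motion estimator, and a voice-activity detector, and the classifier would be trained on annotated segments rather than synthetic labels.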

Cite

Text

Bano et al. "Finding Time Together: Detection and Classification of Focused Interaction in Egocentric Video." IEEE/CVF International Conference on Computer Vision Workshops, 2017. doi:10.1109/ICCVW.2017.274

Markdown

[Bano et al. "Finding Time Together: Detection and Classification of Focused Interaction in Egocentric Video." IEEE/CVF International Conference on Computer Vision Workshops, 2017.](https://mlanthology.org/iccvw/2017/bano2017iccvw-finding/) doi:10.1109/ICCVW.2017.274

BibTeX

@inproceedings{bano2017iccvw-finding,
  title     = {{Finding Time Together: Detection and Classification of Focused Interaction in Egocentric Video}},
  author    = {Bano, Sophia and Zhang, Jianguo and McKenna, Stephen J.},
  booktitle = {IEEE/CVF International Conference on Computer Vision Workshops},
  year      = {2017},
  pages     = {2322--2330},
  doi       = {10.1109/ICCVW.2017.274},
  url       = {https://mlanthology.org/iccvw/2017/bano2017iccvw-finding/}
}