Give Ear to My Face: Modelling Multimodal Attention to Social Interactions

Abstract

We address the deployment of perceptual attention to social interactions as displayed in conversational clips, relying on multimodal information (audio and video). We propose a probabilistic modelling framework that goes beyond the classic saliency paradigm by integrating multiple information cues. Attentional allocation is determined not only by stimulus-driven selection but, importantly, by social value modulating the selection history of relevant multimodal items. The construction of attentional priority is thus the result of a sampling procedure conditioned on the potential value dynamics of socially relevant objects emerging moment to moment within the scene. Preliminary experiments on a publicly available dataset are presented.

Cite

Text

Boccignone et al. "Give Ear to My Face: Modelling Multimodal Attention to Social Interactions." European Conference on Computer Vision Workshops, 2018. doi:10.1007/978-3-030-11012-3_27

Markdown

[Boccignone et al. "Give Ear to My Face: Modelling Multimodal Attention to Social Interactions." European Conference on Computer Vision Workshops, 2018.](https://mlanthology.org/eccvw/2018/boccignone2018eccvw-give/) doi:10.1007/978-3-030-11012-3_27

BibTeX

@inproceedings{boccignone2018eccvw-give,
  title     = {{Give Ear to My Face: Modelling Multimodal Attention to Social Interactions}},
  author    = {Boccignone, Giuseppe and Cuculo, Vittorio and D'Amelio, Alessandro and Grossi, Giuliano and Lanzarotti, Raffaella},
  booktitle = {European Conference on Computer Vision Workshops},
  year      = {2018},
  pages     = {331--345},
  doi       = {10.1007/978-3-030-11012-3_27},
  url       = {https://mlanthology.org/eccvw/2018/boccignone2018eccvw-give/}
}