Multimodal Real-Time Focus of Attention Estimation in SmartRooms

Abstract

This paper presents an overview of our work on real-time multimodal tracking of the focus of attention of multiple persons in a SmartRoom scenario. Redundancy among cameras is exploited to generate a 3D discrete reconstruction of the space. This information is fed to a novel low-complexity Monte Carlo based tracking scheme. Estimated locations of people in the room are used to automatically determine their head positions. The head orientation of every person is computed using video and audio separately, and then a multimodal estimate is produced by combining data at the feature level employing a decentralized Kalman filter. Finally, participants' focus of attention is estimated by means of two geometric descriptors: the attention cone and the attention map. Experiments conducted over annotated databases yield quantitative results proving the effectiveness of the presented approach.
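
The abstract names the attention cone without defining it. As a rough illustration only, the sketch below assumes a cone with its apex at the estimated head position, its axis along the estimated head orientation, and a fixed half-angle; the function name `in_attention_cone`, its parameters, and the 30-degree default are hypothetical and are not taken from the paper.

```python
import math


def in_attention_cone(head, direction, target, half_angle_deg=30.0):
    """Return True if `target` lies inside a cone with apex `head` and axis `direction`.

    All names and the default half-angle are illustrative assumptions,
    not the paper's definition of the attention cone.
    """
    # Vector from the head position to the candidate target.
    to_target = [t - h for t, h in zip(target, head)]
    dist = math.sqrt(sum(c * c for c in to_target))
    norm = math.sqrt(sum(c * c for c in direction))
    if dist == 0.0 or norm == 0.0:
        # Degenerate input: no meaningful angle can be computed.
        return False
    # Cosine of the angle between the head orientation and the target direction.
    cos_angle = sum(a * b for a, b in zip(to_target, direction)) / (dist * norm)
    cos_angle = max(-1.0, min(1.0, cos_angle))  # guard against rounding error
    return math.degrees(math.acos(cos_angle)) <= half_angle_deg
```

Under these assumptions, a person at the origin facing along the x axis would be considered to attend to a target at `(2.0, 0.5, 0.0)`, but not to one directly to the side or behind.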

Cite

Text

Canton-Ferrer et al. "Multimodal Real-Time Focus of Attention Estimation in SmartRooms." IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, 2008. doi:10.1109/CVPRW.2008.4563180

Markdown

[Canton-Ferrer et al. "Multimodal Real-Time Focus of Attention Estimation in SmartRooms." IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, 2008.](https://mlanthology.org/cvprw/2008/cantonferrer2008cvprw-multimodal/) doi:10.1109/CVPRW.2008.4563180

BibTeX

@inproceedings{cantonferrer2008cvprw-multimodal,
  title     = {{Multimodal Real-Time Focus of Attention Estimation in SmartRooms}},
  author    = {Canton-Ferrer, Cristian and Segura, Carlos and Pardàs, Montse and Casas, Josep R. and Hernando, Javier},
  booktitle = {IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops},
  year      = {2008},
  pages     = {1--8},
  doi       = {10.1109/CVPRW.2008.4563180},
  url       = {https://mlanthology.org/cvprw/2008/cantonferrer2008cvprw-multimodal/}
}