Learning Dynamic GMM for Attention Distribution on Single-Face Videos

Abstract

The past decade has witnessed the growing popularity of video conferencing applications, such as FaceTime and Skype. In video conferencing, almost every frame contains a human face. Hence, it is necessary to predict attention on face videos by saliency detection, as saliency can serve as guidance for the region of interest (ROI) in content-based applications. To this end, this paper proposes a novel approach for saliency detection in single-face videos. From the data-driven perspective, we first establish an eye tracking database which contains fixations of 40 subjects viewing 70 single-face videos. Through analysis of our database, we find that most attention is attracted by the face in videos, and that the attention distribution within a face varies with face size and mouth movement. Inspired by previous work which applies a Gaussian mixture model (GMM) to face saliency detection in still images, we propose to model visual attention on the face region in videos by a dynamic GMM (DGMM), the variation of which relies on face size, mouth movement and facial landmarks. Then, we develop a long short-term memory (LSTM) neural network to estimate the DGMM for saliency detection of single-face videos, called LSTM-DGMM. Finally, the experimental results show that our approach outperforms other state-of-the-art approaches in saliency detection of single-face videos.
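
As a rough illustration of the GMM idea mentioned above, the following minimal sketch builds a saliency map as a weighted sum of 2D Gaussians placed on a face. It is not the authors' implementation: the function name `gmm_saliency`, the landmark coordinates, the standard deviations and the mixture weights are all hypothetical values chosen for illustration, and the weights are set by hand rather than estimated by an LSTM.

```python
import numpy as np


def gmm_saliency(height, width, centers, sigmas, weights):
    """Return a saliency map built from a mixture of isotropic 2D Gaussians.

    centers: list of (x, y) pixel positions, one per Gaussian component.
    sigmas:  list of standard deviations in pixels, one per component.
    weights: list of non-negative mixture weights (normalised internally).
    """
    ys, xs = np.mgrid[0:height, 0:width]
    weights = np.asarray(weights, dtype=float)
    weights = weights / weights.sum()  # mixture weights sum to 1
    saliency = np.zeros((height, width), dtype=float)
    for (cx, cy), sigma, w in zip(centers, sigmas, weights):
        sq_dist = (xs - cx) ** 2 + (ys - cy) ** 2
        density = np.exp(-sq_dist / (2.0 * sigma ** 2)) / (2.0 * np.pi * sigma ** 2)
        saliency += w * density
    return saliency / saliency.sum()  # normalise to a probability map


# Hypothetical components in a 64x64 frame: face centre, two eyes, mouth.
centers = [(32, 32), (24, 26), (40, 26), (32, 44)]
sigmas = [12.0, 4.0, 4.0, 5.0]

# Assumed, hand-set weights: more weight on the mouth when it is moving.
still = gmm_saliency(64, 64, centers, sigmas, weights=[0.4, 0.2, 0.2, 0.2])
talking = gmm_saliency(64, 64, centers, sigmas, weights=[0.3, 0.1, 0.1, 0.5])
```

Changing the weights, centers or spreads from frame to frame is what makes such a mixture "dynamic" in this sketch; in the paper, that variation is tied to face size, mouth movement and facial landmarks.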

Cite

Text

Ren et al. "Learning Dynamic GMM for Attention Distribution on Single-Face Videos." IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, 2017. doi:10.1109/CVPRW.2017.208

Markdown

[Ren et al. "Learning Dynamic GMM for Attention Distribution on Single-Face Videos." IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, 2017.](https://mlanthology.org/cvprw/2017/ren2017cvprw-learning/) doi:10.1109/CVPRW.2017.208

BibTeX

@inproceedings{ren2017cvprw-learning,
  title     = {{Learning Dynamic GMM for Attention Distribution on Single-Face Videos}},
  author    = {Ren, Yun and Wang, Zulin and Xu, Mai and Dong, Haoyu and Li, Shengxi},
  booktitle = {IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops},
  year      = {2017},
  pages     = {1632-1641},
  doi       = {10.1109/CVPRW.2017.208},
  url       = {https://mlanthology.org/cvprw/2017/ren2017cvprw-learning/}
}