Discovering Objects of Joint Attention via First-Person Sensing
Abstract
The goal of this work is to discover objects of joint attention, i.e., objects being viewed by multiple people wearing head-mounted cameras and eye trackers. Such objects of joint attention are expected to act as an important cue for understanding social interactions in everyday scenes. To this end, we develop a commonality-clustering method tailored to first-person videos combined with points-of-gaze data. The proposed method uses multiscale spatiotemporal tubes around points of gaze as object candidates, making it possible to deal with objects of various sizes observed in first-person videos. We also introduce a new dataset of multiple pairs of first-person videos and points-of-gaze data. Our experimental results show that our approach outperforms several state-of-the-art commonality-clustering methods.
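The candidate-generation step described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name, the scale values, and the fixed tube length are all assumptions chosen for clarity; it simply crops square windows of several sizes around the per-frame point of gaze and stacks them into short spatiotemporal tubes.

```python
import numpy as np

def gaze_tubes(frames, gaze, scales=(32, 64, 128), length=15):
    """Extract multiscale spatiotemporal tubes centered on points of gaze.

    frames: (T, H, W, C) uint8 video array
    gaze:   (T, 2) array of (x, y) gaze coordinates, one per frame
    scales: half-widths (pixels) of the square crops -- illustrative values
    length: temporal extent of each tube, in frames (illustrative)
    """
    T, H, W, _ = frames.shape
    tubes = []
    # Slide a non-overlapping temporal window over the video.
    for t0 in range(0, T - length + 1, length):
        for s in scales:
            tube = []
            for t in range(t0, t0 + length):
                x, y = int(gaze[t, 0]), int(gaze[t, 1])
                # Clamp the crop window to the frame boundaries.
                x0, x1 = max(0, x - s), min(W, x + s)
                y0, y1 = max(0, y - s), min(H, y + s)
                tube.append(frames[t, y0:y1, x0:x1])
            tubes.append(tube)
    return tubes
```

Each resulting tube is a list of gaze-centered crops at one scale; clustering such tubes across the videos of multiple wearers for commonality is what yields the objects of joint attention.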
Cite
Text
Kera et al. "Discovering Objects of Joint Attention via First-Person Sensing." IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, 2016. doi:10.1109/CVPRW.2016.52
Markdown
[Kera et al. "Discovering Objects of Joint Attention via First-Person Sensing." IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, 2016.](https://mlanthology.org/cvprw/2016/kera2016cvprw-discovering/) doi:10.1109/CVPRW.2016.52
BibTeX
@inproceedings{kera2016cvprw-discovering,
title = {{Discovering Objects of Joint Attention via First-Person Sensing}},
author = {Kera, Hiroshi and Yonetani, Ryo and Higuchi, Keita and Sato, Yoichi},
booktitle = {IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops},
year = {2016},
pages = {361--369},
doi = {10.1109/CVPRW.2016.52},
url = {https://mlanthology.org/cvprw/2016/kera2016cvprw-discovering/}
}