Learning Attributes from Human Gaze
Abstract
While semantic visual attributes have been shown to be useful for a variety of tasks, many attributes are difficult to model computationally. One reason for this difficulty is that it is not clear where in an image the attribute lives. We propose to tackle this problem by involving humans more directly in the process of learning an attribute model. We ask humans to examine a set of images to determine whether a given attribute is present, and we record where they looked. We create gaze maps for each attribute and use these gaze maps to improve attribute prediction models. For test images, gaze maps are not available, so we predict them using models learned from the gaze maps collected for each attribute of interest. Compared to six baselines, we improve prediction accuracy on attributes of faces and shoes, and we show how our method might be adapted for scene images. We demonstrate additional uses of our gaze maps for visualizing attribute models and for learning "schools of thought" among users in terms of their understanding of the attribute.
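One way to picture how a gaze map could inform attribute prediction is to weight local image features by predicted gaze density before pooling them into a descriptor for the attribute classifier. The sketch below is an illustrative simplification, not the paper's actual pipeline; the function name and array shapes are assumptions for the example.

```python
import numpy as np

def gaze_weighted_descriptor(feature_grid, gaze_map):
    """Pool a grid of local features, weighting each cell by predicted gaze.

    feature_grid: (H, W, D) array of local image descriptors.
    gaze_map:     (H, W) array of non-negative gaze density values.
    Returns a (D,) descriptor emphasizing regions humans looked at.
    (Hypothetical helper for illustration; not the authors' method.)
    """
    weights = gaze_map / (gaze_map.sum() + 1e-8)  # normalize gaze to a distribution
    # Weighted sum over the spatial dimensions yields one D-dim descriptor
    return np.tensordot(weights, feature_grid, axes=([0, 1], [0, 1]))

# Toy example: a 4x4 grid of 8-dim features, with all gaze on one cell
rng = np.random.default_rng(0)
features = rng.random((4, 4, 8))
gaze = np.zeros((4, 4))
gaze[0, 0] = 1.0  # gaze concentrated on the top-left cell
desc = gaze_weighted_descriptor(features, gaze)
# With all gaze mass in one cell, the descriptor matches that cell's feature
assert np.allclose(desc, features[0, 0])
```

The resulting descriptor can then be fed to any standard classifier (e.g. an SVM) in place of uniformly pooled features, so that regions humans fixated on dominate the attribute decision.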
Cite
Text
Murrugarra-Llerena and Kovashka. "Learning Attributes from Human Gaze." IEEE/CVF Winter Conference on Applications of Computer Vision, 2017. doi:10.1109/WACV.2017.63
Markdown
[Murrugarra-Llerena and Kovashka. "Learning Attributes from Human Gaze." IEEE/CVF Winter Conference on Applications of Computer Vision, 2017.](https://mlanthology.org/wacv/2017/murrugarrallerena2017wacv-learning/) doi:10.1109/WACV.2017.63
BibTeX
@inproceedings{murrugarrallerena2017wacv-learning,
title = {{Learning Attributes from Human Gaze}},
author = {Murrugarra-Llerena, Nils and Kovashka, Adriana},
booktitle = {IEEE/CVF Winter Conference on Applications of Computer Vision},
year = {2017},
pages = {510-519},
doi = {10.1109/WACV.2017.63},
url = {https://mlanthology.org/wacv/2017/murrugarrallerena2017wacv-learning/}
}