Benchmarking Gaze Prediction for Categorical Visual Search
Abstract
The prediction of human shifts of attention is a widely studied question in both behavioral vision science and computer vision, especially in the context of free-viewing tasks. Search behavior, however, where fixation scanpaths depend strongly on the viewer's goals, has received far less attention, even though visual search constitutes much of a person's everyday visual behavior. One reason for this is the absence of real-world image datasets on which search models can be trained. In this paper we present a carefully curated dataset for two target categories, microwaves and clocks, drawn from the COCO2014 dataset. A total of 2183 images were presented to multiple participants, who were tasked with searching for one of the two categories. This yielded a total of 16184 validated fixations for training, making our microwave-clock dataset currently one of the largest datasets of eye fixations in categorical search. We also present a 40-image testing dataset in which each image depicts both a microwave and a clock target. Distinct fixation patterns emerged depending on whether participants searched the same images for a microwave (n=30) or a clock (n=30), meaning that models must predict different search scanpaths from the same pixel inputs. We report the results of several state-of-the-art deep network models that were trained and evaluated on these datasets. Collectively, these datasets and our evaluation protocol provide what we hope will be a useful test-bed for the development of new methods for predicting category-specific visual search behavior.
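As a concrete illustration of the curation starting point, the sketch below shows how candidate images containing either target category can be pulled from the COCO2014 annotations with pycocotools. This is only an assumed first filtering step, not the authors' full curation pipeline (which involved additional selection and validation), and the annotation file path is a placeholder.

from pycocotools.coco import COCO

# Load the COCO2014 training-set annotations (placeholder path).
coco = COCO("annotations/instances_train2014.json")

# Collect the candidate images that contain each target category.
for category in ["microwave", "clock"]:
    cat_ids = coco.getCatIds(catNms=[category])  # COCO category id(s) for this name
    img_ids = coco.getImgIds(catIds=cat_ids)     # images with at least one instance
    print(f"{category}: {len(img_ids)} candidate images")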
Cite
Text
Zelinsky et al. "Benchmarking Gaze Prediction for Categorical Visual Search." IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, 2019. doi:10.1109/CVPRW.2019.00111Markdown
[Zelinsky et al. "Benchmarking Gaze Prediction for Categorical Visual Search." IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, 2019.](https://mlanthology.org/cvprw/2019/zelinsky2019cvprw-benchmarking/) doi:10.1109/CVPRW.2019.00111BibTeX
@inproceedings{zelinsky2019cvprw-benchmarking,
  title = {{Benchmarking Gaze Prediction for Categorical Visual Search}},
  author = {Zelinsky, Gregory J. and Yang, Zhibo and Huang, Lihan and Chen, Yupei and Ahn, Seoyoung and Wei, Zijun and Adeli, Hossein and Samaras, Dimitris and Hoai, Minh},
  booktitle = {IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops},
  year = {2019},
  pages = {828--836},
  doi = {10.1109/CVPRW.2019.00111},
  url = {https://mlanthology.org/cvprw/2019/zelinsky2019cvprw-benchmarking/}
}