Predicting Eye Fixations Using Convolutional Neural Networks
Abstract
It is believed that eye movements in free-viewing of natural scenes are directed by both bottom-up visual saliency and top-down visual factors. In this paper, we propose a novel computational framework to simultaneously learn these two types of visual features from raw image data using a multiresolution convolutional neural network (Mr-CNN) for predicting eye fixations. The Mr-CNN is trained directly on image regions centered on fixation and non-fixation locations over multiple resolutions, using raw image pixels as inputs and eye fixation attributes as labels. Diverse top-down visual features can be learned in higher layers. Meanwhile, bottom-up visual saliency can also be inferred by combining information over multiple resolutions. Finally, optimal integration of bottom-up and top-down cues can be learned in the last logistic regression layer to predict eye fixations. The proposed approach achieves state-of-the-art results on four publicly available benchmark datasets, demonstrating the superiority of our work.
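The input sampling and final logistic layer described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the downsampling factors, the patch size, the block-average downsampling, and the use of patch means as stand-in features (in place of learned convolutional features) are all assumptions made for illustration, and the function names are hypothetical.

```python
import math

def downsample(image, factor):
    """Block-average a 2D list-of-lists image by an integer factor."""
    h, w = len(image) // factor, len(image[0]) // factor
    return [
        [
            sum(image[y * factor + dy][x * factor + dx]
                for dy in range(factor) for dx in range(factor)) / factor ** 2
            for x in range(w)
        ]
        for y in range(h)
    ]

def crop_patch(image, cy, cx, size):
    """Crop a size x size patch centered at (cy, cx), zero-padding outside the image."""
    half = size // 2
    h, w = len(image), len(image[0])
    return [
        [
            image[y][x] if 0 <= y < h and 0 <= x < w else 0.0
            for x in range(cx - half, cx - half + size)
        ]
        for y in range(cy - half, cy - half + size)
    ]

def multires_patches(image, cy, cx, factors=(1, 2, 4), size=5):
    """Return one patch per resolution, all centered on the same image location.

    The factors and patch size are illustrative assumptions, not values
    taken from the paper.
    """
    return [
        crop_patch(downsample(image, f), cy // f, cx // f, size)
        for f in factors
    ]

def logistic(features, weights, bias):
    """Logistic regression output: a fixation probability in (0, 1)."""
    z = sum(f * w for f, w in zip(features, weights)) + bias
    return 1.0 / (1.0 + math.exp(-z))

# Usage: a constant 16 x 16 "image" and a candidate location at its center.
image = [[1.0] * 16 for _ in range(16)]
patches = multires_patches(image, 8, 8)

# Stand-in features: the mean of each patch (a real model would use
# learned convolutional features from each resolution stream instead).
features = [sum(map(sum, p)) / 25 for p in patches]
probability = logistic(features, weights=[0.5, 0.5, 0.5], bias=-1.0)
```

Coarser resolutions cover a wider context around the same location with the same patch size, which is the sense in which combining them lets the final layer weigh local detail against surrounding context.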
Cite
Text
Liu et al. "Predicting Eye Fixations Using Convolutional Neural Networks." Conference on Computer Vision and Pattern Recognition, 2015. doi:10.1109/CVPR.2015.7298633
Markdown
[Liu et al. "Predicting Eye Fixations Using Convolutional Neural Networks." Conference on Computer Vision and Pattern Recognition, 2015.](https://mlanthology.org/cvpr/2015/liu2015cvpr-predicting/) doi:10.1109/CVPR.2015.7298633
BibTeX
@inproceedings{liu2015cvpr-predicting,
title = {{Predicting Eye Fixations Using Convolutional Neural Networks}},
author = {Liu, Nian and Han, Junwei and Zhang, Dingwen and Wen, Shifeng and Liu, Tianming},
booktitle = {Conference on Computer Vision and Pattern Recognition},
year = {2015},
doi = {10.1109/CVPR.2015.7298633},
url = {https://mlanthology.org/cvpr/2015/liu2015cvpr-predicting/}
}