Scanpath Prediction for Visual Attention Using IOR-ROI LSTM
Abstract
Predicting the scanpath produced when a stimulus is presented plays an important role in modeling visual attention and search. This paper presents a model that integrates a convolutional neural network and long short-term memory (LSTM) networks to generate realistic scanpaths. The core of the proposed model is a dual LSTM unit, i.e., an inhibition-of-return LSTM (IOR-LSTM) and a region-of-interest LSTM (ROI-LSTM), capturing IOR dynamics and gaze-shift behavior simultaneously. The IOR-LSTM simulates visual working memory to adaptively integrate and forget scene information. The ROI-LSTM is responsible for predicting the next ROI given the inhibited image features. Experimental results indicate that the proposed architecture achieves superior performance in predicting scanpaths.
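The dual-LSTM idea described in the abstract can be sketched in a few lines of numpy. The sketch below is a minimal illustration, not the authors' implementation: all dimensions, parameter names (`W_inh`, `W_roi`), and the choice of a sigmoid inhibition mask and a 2-D tanh ROI head are assumptions made for illustration. At each fixation, an IOR-LSTM updates a working-memory state from the current image features, a per-feature inhibition mask derived from that state suppresses already-attended content, and a ROI-LSTM reads the inhibited features to predict the next region of interest.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x, h, c, W, U, b):
    """One step of a standard LSTM cell (input, forget, output, candidate gates)."""
    n = h.size
    z = W @ x + U @ h + b
    i, f, o = sigmoid(z[:n]), sigmoid(z[n:2 * n]), sigmoid(z[2 * n:3 * n])
    g = np.tanh(z[3 * n:])
    c_new = f * c + i * g
    h_new = o * np.tanh(c_new)
    return h_new, c_new

def make_params(in_dim, hid, rng, scale=0.1):
    """Randomly initialised LSTM weights (W, U, b) for a 4-gate cell."""
    return (scale * rng.standard_normal((4 * hid, in_dim)),
            scale * rng.standard_normal((4 * hid, hid)),
            np.zeros(4 * hid))

FEAT, HID = 16, 8                                  # illustrative sizes, not the paper's
ior_params = make_params(FEAT, HID, rng)
roi_params = make_params(FEAT, HID, rng)
W_inh = 0.1 * rng.standard_normal((FEAT, HID))     # hidden -> per-feature inhibition (assumed head)
W_roi = 0.1 * rng.standard_normal((2, HID))        # hidden -> (x, y) fixation (assumed head)

def ior_roi_step(feat, state):
    """One fixation: IOR-LSTM inhibits the features, ROI-LSTM predicts the next ROI."""
    (h_i, c_i), (h_r, c_r) = state
    h_i, c_i = lstm_step(feat, h_i, c_i, *ior_params)   # integrate/forget scene information
    inhibition = sigmoid(W_inh @ h_i)                   # suppression weights in (0, 1)
    h_r, c_r = lstm_step(feat * inhibition, h_r, c_r, *roi_params)
    next_roi = np.tanh(W_roi @ h_r)                     # normalised (x, y) in [-1, 1]
    return next_roi, ((h_i, c_i), (h_r, c_r))

# Roll out a 5-fixation scanpath on random stand-in "CNN features".
state = ((np.zeros(HID), np.zeros(HID)), (np.zeros(HID), np.zeros(HID)))
scanpath = []
for _ in range(5):
    feat = rng.standard_normal(FEAT)  # placeholder for features at the current ROI
    roi, state = ior_roi_step(feat, state)
    scanpath.append(roi)
print(np.array(scanpath).shape)  # (5, 2)
```

In a trained model the random features would come from a CNN and the parameters would be learned end-to-end; the point of the sketch is only the data flow, features → IOR-LSTM → inhibition → ROI-LSTM → next fixation.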
Cite
Text
Chen and Sun. "Scanpath Prediction for Visual Attention Using IOR-ROI LSTM." International Joint Conference on Artificial Intelligence, 2018. doi:10.24963/IJCAI.2018/89
Markdown
[Chen and Sun. "Scanpath Prediction for Visual Attention Using IOR-ROI LSTM." International Joint Conference on Artificial Intelligence, 2018.](https://mlanthology.org/ijcai/2018/chen2018ijcai-scanpath/) doi:10.24963/IJCAI.2018/89
BibTeX
@inproceedings{chen2018ijcai-scanpath,
title = {{Scanpath Prediction for Visual Attention Using IOR-ROI LSTM}},
author = {Chen, Zhenzhong and Sun, Wanjie},
booktitle = {International Joint Conference on Artificial Intelligence},
year = {2018},
pages = {642-648},
doi = {10.24963/IJCAI.2018/89},
url = {https://mlanthology.org/ijcai/2018/chen2018ijcai-scanpath/}
}