Learning Attention Map from Images

Abstract

While bottom-up and top-down processes have proven effective for predicting attention and eye-fixation maps on images, in this paper, inspired by the perceptual organization mechanism that precedes attention selection, we propose to utilize figure-ground maps for this purpose. To take both pixel-wise and region-wise interactions into account when predicting label probabilities for each pixel, we develop a context-aware model based on multiple segmentations to obtain the final results. Finally, we evaluate both the new features and the model on the MIT attention dataset [14]. Quantitative experiments demonstrate that figure-ground cues are valid predictors of attention selection, and that our proposed model improves over the baseline method.
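The abstract's idea of combining pixel-wise and region-wise interactions over multiple segmentations can be illustrated with a minimal sketch. This is not the authors' model; it only shows the generic fusion step that the abstract alludes to, under the assumption that each segmentation smooths per-pixel scores by region-wise pooling before the per-segmentation maps are averaged. All function names here are hypothetical.

```python
import numpy as np

def fuse_multi_segmentation(pixel_scores, segmentations):
    """Hypothetical sketch: fuse per-pixel attention scores across
    multiple segmentations of the same image.

    pixel_scores  : 2-D array of per-pixel scores (e.g. label probabilities).
    segmentations : list of 2-D integer arrays; each assigns a region id
                    to every pixel.
    """
    fused = np.zeros_like(pixel_scores, dtype=float)
    for seg in segmentations:
        smoothed = np.empty_like(pixel_scores, dtype=float)
        for region_id in np.unique(seg):
            mask = seg == region_id
            # Region-wise pooling: every pixel in a region takes the
            # region's mean score, enforcing region-level consistency.
            smoothed[mask] = pixel_scores[mask].mean()
        fused += smoothed
    # Average across segmentations so no single segmentation dominates.
    return fused / len(segmentations)
```

Averaging over several candidate segmentations hedges against any single segmentation placing a region boundary badly, which is one common motivation for multiple-segmentation models.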

Cite

Text

Lu et al. "Learning Attention Map from Images." IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2012. doi:10.1109/CVPR.2012.6247785

Markdown

[Lu et al. "Learning Attention Map from Images." IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2012.](https://mlanthology.org/cvpr/2012/lu2012cvpr-learning/) doi:10.1109/CVPR.2012.6247785

BibTeX

@inproceedings{lu2012cvpr-learning,
  title     = {{Learning Attention Map from Images}},
  author    = {Lu, Yao and Zhang, Wei and Jin, Cheng and Xue, Xiangyang},
  booktitle = {IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  year      = {2012},
  pages     = {1067--1074},
  doi       = {10.1109/CVPR.2012.6247785},
  url       = {https://mlanthology.org/cvpr/2012/lu2012cvpr-learning/}
}