Boosting Bottom-up and Top-Down Visual Features for Saliency Estimation

Abstract

Despite significant recent progress, the best available visual saliency models still lag behind human performance in predicting eye fixations during free viewing of natural scenes. The majority of models are based on low-level visual features, and the importance of top-down factors has not yet been fully explored or modeled. Here, we combine low-level features such as orientation, color, and intensity, along with the saliency maps of the previous best bottom-up models, with top-down cognitive visual features (e.g., faces, humans, and cars) and learn a direct mapping from those features to eye fixations using Regression, SVM, and AdaBoost classifiers. Through extensive experiments over three benchmark eye-tracking datasets using three popular evaluation scores, we show that our boosting model outperforms 27 state-of-the-art models and is so far the closest model to human accuracy in fixation prediction. Furthermore, our model successfully detects the most salient object in a scene without sophisticated image processing such as region segmentation.
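The learned mapping described above can be sketched as a per-pixel binary classification problem: each pixel is represented by a vector stacking bottom-up channel responses and top-down detector scores, labeled as fixated or non-fixated from eye-tracking data, and a boosted ensemble is trained on these pairs. The following is a minimal illustrative sketch using synthetic data and scikit-learn's AdaBoost, not the paper's actual implementation; all feature names, dimensions, and weights are hypothetical.

```python
import numpy as np
from sklearn.ensemble import AdaBoostClassifier

rng = np.random.default_rng(0)

# Synthetic stand-in for per-pixel feature vectors: each row would stack
# bottom-up channels (orientation, color, intensity, prior model maps)
# with top-down detector responses (e.g., face/person/car scores).
n_pixels, n_features = 2000, 10
X = rng.normal(size=(n_pixels, n_features))

# Synthetic labels: 1 = fixated pixel, 0 = non-fixated. In the real setup
# these come from recorded human eye fixations; here the "true" saliency
# is a hypothetical weighted sum of a few channels plus noise.
w = np.zeros(n_features)
w[[0, 3, 7]] = [1.5, 1.0, 2.0]
y = (X @ w + rng.normal(scale=0.5, size=n_pixels) > 0).astype(int)

# A boosted ensemble of weak learners maps feature vectors to fixation labels.
clf = AdaBoostClassifier(n_estimators=50, random_state=0)
clf.fit(X, y)

# The predicted probability of the "fixated" class serves as the
# saliency value assigned to each pixel.
saliency = clf.predict_proba(X)[:, 1]
print("training accuracy:", clf.score(X, y))
```

At test time, the per-pixel saliency scores would be reshaped back into an image-sized map; Regression or SVM learners could be swapped in for the AdaBoost ensemble in the same pipeline.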

Cite

Text

Borji. "Boosting Bottom-up and Top-Down Visual Features for Saliency Estimation." IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2012. doi:10.1109/CVPR.2012.6247706

Markdown

[Borji. "Boosting Bottom-up and Top-Down Visual Features for Saliency Estimation." IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2012.](https://mlanthology.org/cvpr/2012/borji2012cvpr-boosting/) doi:10.1109/CVPR.2012.6247706

BibTeX

@inproceedings{borji2012cvpr-boosting,
  title     = {{Boosting Bottom-up and Top-Down Visual Features for Saliency Estimation}},
  author    = {Borji, Ali},
  booktitle = {IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  year      = {2012},
  pages     = {438-445},
  doi       = {10.1109/CVPR.2012.6247706},
  url       = {https://mlanthology.org/cvpr/2012/borji2012cvpr-boosting/}
}