Understanding and Predicting Importance in Images

Abstract

What do people care about in an image? To drive computational visual recognition toward more human-centric outputs, we need a better understanding of how people perceive and judge the importance of content in images. In this paper, we explore how a number of factors relate to human perception of importance. Proposed factors fall into three broad types: 1) factors related to composition, e.g., size and location; 2) factors related to semantics, e.g., category of object or scene; and 3) contextual factors related to the likelihood of attribute-object or object-scene pairs. We explore these factors using what people describe in an image as a proxy for importance. Finally, we build models to predict what will be described about an image given either known image content or image content estimated automatically by recognition systems.

Cite

Text

Berg et al. "Understanding and Predicting Importance in Images." IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2012. doi:10.1109/CVPR.2012.6248100

Markdown

[Berg et al. "Understanding and Predicting Importance in Images." IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2012.](https://mlanthology.org/cvpr/2012/berg2012cvpr-understanding/) doi:10.1109/CVPR.2012.6248100

BibTeX

@inproceedings{berg2012cvpr-understanding,
  title     = {{Understanding and Predicting Importance in Images}},
  author    = {Berg, Alexander C. and Berg, Tamara L. and Daumé, III, Hal and Dodge, Jesse and Goyal, Amit and Han, Xufeng and Mensch, Alyssa C. and Mitchell, Margaret and Sood, Aneesh and Stratos, Karl and Yamaguchi, Kota},
  booktitle = {IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  year      = {2012},
  pages     = {3562--3569},
  doi       = {10.1109/CVPR.2012.6248100},
  url       = {https://mlanthology.org/cvpr/2012/berg2012cvpr-understanding/}
}