Scene Context-Aware Salient Object Detection

Abstract

Salient object detection identifies the objects in an image that attract visual attention. Although recent methods take contextual features into account, they often fail in complex real-world scenarios. We observe that this is mainly due to two issues. First, most existing datasets consist of simple foregrounds and backgrounds that hardly represent real-life scenarios. Second, current methods learn contextual features only of salient objects, which is insufficient to model the high-level semantics needed for saliency reasoning in complex scenes. To address these problems, we first construct a new large-scale dataset with complex scenes. We then propose a context-aware learning approach that explicitly exploits semantic scene contexts. Specifically, we propose two modules: 1) a Semantic Scene Context Refinement module, which enhances contextual features learned from salient objects with scene context, and 2) a Contextual Instance Transformer, which learns contextual relations between objects and scene context. To our knowledge, such high-level semantic contextual information of image scenes is under-explored for saliency detection in the literature. Extensive experiments demonstrate that the proposed approach outperforms state-of-the-art techniques for saliency detection in complex scenarios and transfers well to other existing datasets. The code and dataset are available at https://github.com/SirisAvishek/Scene_Context_Aware_Saliency.
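
As a rough illustration of what "learning contextual relations between objects and scene context" can mean, the sketch below applies plain scaled dot-product cross-attention, with per-object features as queries and scene-context features as keys and values. It is not the paper's Contextual Instance Transformer: the function name `cross_attend`, the toy feature vectors, and the absence of learned projections are all assumptions made for illustration. The released repository contains the actual implementation.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attend(instance_feats, scene_feats):
    """Hypothetical helper (not from the paper): each object feature
    attends over scene-context features and is refined by a residual add.

    instance_feats: list of per-object feature vectors (queries)
    scene_feats:    list of scene-context feature vectors (keys and values)
    Returns (refined_feats, attention_weights).
    """
    dim = len(scene_feats[0])
    scale = math.sqrt(dim)
    refined, weights = [], []
    for q in instance_feats:
        # Similarity between this object and every scene-context vector.
        scores = [sum(a * b for a, b in zip(q, k)) / scale for k in scene_feats]
        w = softmax(scores)
        # Weighted sum of scene-context vectors (values == keys here).
        ctx = [sum(w_j * v[d] for w_j, v in zip(w, scene_feats)) for d in range(dim)]
        # Residual connection: object feature enriched with scene context.
        refined.append([qi + ci for qi, ci in zip(q, ctx)])
        weights.append(w)
    return refined, weights


# Toy inputs (assumed values, for illustration only).
instance_feats = [[1.0, 0.0], [0.0, 1.0]]
scene_feats = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
refined, attn = cross_attend(instance_feats, scene_feats)
```

Each row of `attn` is a distribution over the scene-context vectors, so it shows which parts of the scene an object draws on. A trained model would add learned query, key, and value projections and multiple heads, which this sketch omits.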

Cite

Text

Siris et al. "Scene Context-Aware Salient Object Detection." International Conference on Computer Vision, 2021. doi:10.1109/ICCV48922.2021.00412

Markdown

[Siris et al. "Scene Context-Aware Salient Object Detection." International Conference on Computer Vision, 2021.](https://mlanthology.org/iccv/2021/siris2021iccv-scene/) doi:10.1109/ICCV48922.2021.00412

BibTeX

@inproceedings{siris2021iccv-scene,
  title     = {{Scene Context-Aware Salient Object Detection}},
  author    = {Siris, Avishek and Jiao, Jianbo and Tam, Gary K.L. and Xie, Xianghua and Lau, Rynson W.H.},
  booktitle = {International Conference on Computer Vision},
  year      = {2021},
  pages     = {4156--4166},
  doi       = {10.1109/ICCV48922.2021.00412},
  url       = {https://mlanthology.org/iccv/2021/siris2021iccv-scene/}
}