Learning RGB-D Salient Object Detection Using Background Enclosure, Depth Contrast, and Top-Down Features

Abstract

In human visual saliency, top-down and bottom-up information are combined as the basis of visual attention. Recently, deep Convolutional Neural Networks (CNNs) have demonstrated strong performance on RGB salient object detection, providing an effective mechanism for combining top-down semantic information with low-level features. Although depth information has been shown to be important for human perception of salient objects, the use of top-down information and the exploration of CNNs for RGB-D salient object detection remain limited. Here we propose a novel deep CNN architecture for RGB-D salient object detection that utilizes both top-down and bottom-up cues. To produce such an architecture, we present novel depth features that capture the ideas of background enclosure, depth contrast, and histogram distance in a manner that is suitable for a learned approach. We show improved results compared to state-of-the-art RGB-D salient object detection methods. We also show that the low-level and mid-level depth features both contribute to improvements in results. In particular, the F-Score of our method is 0.848 on RGBD1000, which is 10.7% better than the current best.
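The abstract reports results as an F-Score without stating how it is computed. As a rough illustration only, the sketch below computes the weighted F-measure commonly used in salient object detection, F = (1 + β²)·P·R / (β²·P + R), on a thresholded saliency map. The function name `f_measure`, the fixed threshold of 0.5, and the weight β² = 0.3 are assumptions made here for illustration; they are not taken from the paper, which may use a different threshold scheme or weighting.

```python
def f_measure(saliency, ground_truth, threshold=0.5, beta_sq=0.3):
    """Weighted F-measure of a thresholded saliency map against a binary mask.

    saliency:     2D list of floats in [0, 1] (predicted saliency per pixel)
    ground_truth: 2D list of 0/1 values (1 marks a salient pixel)
    threshold and beta_sq are illustrative assumptions, not values from the paper.
    """
    tp = fp = fn = 0
    for sal_row, gt_row in zip(saliency, ground_truth):
        for s, g in zip(sal_row, gt_row):
            predicted = s >= threshold
            if predicted and g:
                tp += 1  # salient pixel correctly detected
            elif predicted and not g:
                fp += 1  # background pixel marked salient
            elif g:
                fn += 1  # salient pixel missed
    if tp == 0:
        # No true positives: precision or recall is zero or undefined.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return (1 + beta_sq) * precision * recall / (beta_sq * precision + recall)


# Toy 2x2 example: one hit, one false alarm, one miss.
toy_saliency = [[0.9, 0.8], [0.2, 0.1]]
toy_mask = [[1, 0], [1, 0]]
toy_score = f_measure(toy_saliency, toy_mask)
```

With β² below 1, precision is weighted more heavily than recall, which is the usual motivation for this choice in saliency benchmarks.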

Cite

Text

Shigematsu et al. "Learning RGB-D Salient Object Detection Using Background Enclosure, Depth Contrast, and Top-Down Features." IEEE/CVF International Conference on Computer Vision Workshops, 2017. doi:10.1109/ICCVW.2017.323

Markdown

[Shigematsu et al. "Learning RGB-D Salient Object Detection Using Background Enclosure, Depth Contrast, and Top-Down Features." IEEE/CVF International Conference on Computer Vision Workshops, 2017.](https://mlanthology.org/iccvw/2017/shigematsu2017iccvw-learning/) doi:10.1109/ICCVW.2017.323

BibTeX

@inproceedings{shigematsu2017iccvw-learning,
  title     = {{Learning RGB-D Salient Object Detection Using Background Enclosure, Depth Contrast, and Top-Down Features}},
  author    = {Shigematsu, Riku and Feng, David and You, Shaodi and Barnes, Nick},
  booktitle = {IEEE/CVF International Conference on Computer Vision Workshops},
  year      = {2017},
  pages     = {2749-2757},
  doi       = {10.1109/ICCVW.2017.323},
  url       = {https://mlanthology.org/iccvw/2017/shigematsu2017iccvw-learning/}
}