Weakly Supervised Semantic Segmentation for Social Images

Abstract

Image semantic segmentation is the task of partitioning an image into several regions based on semantic concepts. In this paper, we learn a weakly supervised semantic segmentation model from social images whose labels are given at the image level rather than the pixel level; furthermore, these labels may be noisy. We present a joint conditional random field model that leverages various contexts to address this issue. More specifically, we extract global and local features at multiple scales using a convolutional neural network and a topic model. Inter-label correlations are captured by visual contextual cues and label co-occurrence statistics. Finally, consistency between image-level and pixel-level labels is achieved by iterative refinement. Experimental results on two real-world image datasets, PASCAL VOC2007 and SIFT-Flow, demonstrate that the proposed approach outperforms state-of-the-art weakly supervised methods and even achieves accuracy comparable with fully supervised methods.
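As an illustration of the label co-occurrence statistics mentioned in the abstract, the following minimal Python sketch counts how often pairs of image-level labels appear together and derives a simple conditional estimate. It is not the authors' implementation: the function names `cooccurrence_counts` and `conditional_cooccurrence`, the toy tags, and the choice of a plain conditional frequency are all assumptions made for this example, and the abstract does not specify how the statistics enter the conditional random field.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_counts(image_labels):
    """Count label frequencies and unordered label-pair frequencies per image."""
    label_counts = Counter()
    pair_counts = Counter()
    for labels in image_labels:
        unique = sorted(set(labels))  # ignore duplicate tags on one image
        label_counts.update(unique)
        # Pairs come out in sorted order because `unique` is sorted.
        pair_counts.update(combinations(unique, 2))
    return label_counts, pair_counts


def conditional_cooccurrence(label_counts, pair_counts, a, b):
    """Estimate P(b | a): fraction of images tagged `a` that are also tagged `b`."""
    if label_counts[a] == 0:
        return 0.0
    pair = tuple(sorted((a, b)))
    return pair_counts[pair] / label_counts[a]


# Hypothetical toy tags for illustration only; not data from the paper.
toy_tags = [
    ["sky", "sea", "boat"],
    ["sky", "building"],
    ["sea", "boat"],
    ["sky", "sea"],
]
label_counts, pair_counts = cooccurrence_counts(toy_tags)
p_sea_given_boat = conditional_cooccurrence(label_counts, pair_counts, "boat", "sea")
```

With noisy social tags, raw counts like these would likely need smoothing or thresholding before use; that step is omitted here because the abstract does not describe it.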

Cite

Text

Zhang et al. "Weakly Supervised Semantic Segmentation for Social Images." Conference on Computer Vision and Pattern Recognition, 2015. doi:10.1109/CVPR.2015.7298888

Markdown

[Zhang et al. "Weakly Supervised Semantic Segmentation for Social Images." Conference on Computer Vision and Pattern Recognition, 2015.](https://mlanthology.org/cvpr/2015/zhang2015cvpr-weakly/) doi:10.1109/CVPR.2015.7298888

BibTeX

@inproceedings{zhang2015cvpr-weakly,
  title     = {{Weakly Supervised Semantic Segmentation for Social Images}},
  author    = {Zhang, Wei and Zeng, Sheng and Wang, Dequan and Xue, Xiangyang},
  booktitle = {Conference on Computer Vision and Pattern Recognition},
  year      = {2015},
  doi       = {10.1109/CVPR.2015.7298888},
  url       = {https://mlanthology.org/cvpr/2015/zhang2015cvpr-weakly/}
}