Charge Own Job: Saliency mAP and Visual Word Encoder for Image-Level Semantic Segmentation
Abstract
Weakly-supervised semantic segmentation (WSSS) methods with image-level labels have advanced significantly, but they still have several key limitations: incomplete object regions, object boundary mismatch, and co-occurring pixels from non-target objects. To address these issues, we propose a novel joint learning framework, namely Saliency Map and Visual Word Encoder (SMVWE), which employs two weak supervisions to generate high-quality pseudo-labels. Specifically, we develop a visual word encoder that encodes the localization map into semantic words with a learnable codebook, so that the network generates localization maps containing more semantic regions from the encoded fine-grained semantic words. Moreover, to obtain accurate object boundaries and eliminate co-occurring pixels, we design a saliency map selection mechanism with pseudo-pixel feedback to separate the foreground from the background. During joint learning, we fully utilize the cooperative relationship between semantic word labels and saliency maps to generate high-quality pseudo-labels, thus remarkably improving segmentation accuracy. Extensive experiments demonstrate that our proposed method better tackles the above key challenges of WSSS and obtains state-of-the-art performance on the PASCAL VOC 2012 segmentation benchmark.
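The two weak supervisions named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the cosine-similarity assignment, the fixed (non-learned) codebook, and the hard saliency threshold are all assumptions made for illustration, and the pseudo-pixel feedback step is not modeled.

```python
import numpy as np

def encode_visual_words(features, codebook):
    """Assign each pixel feature to its most similar codeword.

    features: (H, W, D) pixel features (hypothetical backbone output).
    codebook: (K, D) codewords; learnable in the paper, fixed here (assumption).
    Returns an (H, W) map of codeword indices and a length-K 0/1 vector
    marking which visual words occur in the image.
    """
    h, w, d = features.shape
    flat = features.reshape(-1, d)
    # L2-normalise so the dot product below is cosine similarity.
    flat = flat / (np.linalg.norm(flat, axis=1, keepdims=True) + 1e-8)
    words = codebook / (np.linalg.norm(codebook, axis=1, keepdims=True) + 1e-8)
    similarity = flat @ words.T  # shape (H*W, K)
    word_map = similarity.argmax(axis=1).reshape(h, w)
    word_labels = np.zeros(codebook.shape[0], dtype=np.int64)
    word_labels[np.unique(word_map)] = 1
    return word_map, word_labels

def mask_with_saliency(cam, saliency, threshold=0.5):
    """Keep localization scores only where saliency marks foreground.

    cam: (C, H, W) class localization maps; saliency: (H, W) values in [0, 1].
    The hard threshold is an assumption, not the paper's selection mechanism.
    """
    return cam * (saliency >= threshold)[None, :, :]
```

In this sketch, `word_labels` plays the role of the image-level semantic word labels, and `mask_with_saliency` shows how a saliency map can suppress background or co-occurring pixels in a localization map.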
Cite
Text
Guo et al. "Charge Own Job: Saliency mAP and Visual Word Encoder for Image-Level Semantic Segmentation." European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, 2022. doi:10.1007/978-3-031-26409-2_33
Markdown
[Guo et al. "Charge Own Job: Saliency mAP and Visual Word Encoder for Image-Level Semantic Segmentation." European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, 2022.](https://mlanthology.org/ecmlpkdd/2022/guo2022ecmlpkdd-charge/) doi:10.1007/978-3-031-26409-2_33
BibTeX
@inproceedings{guo2022ecmlpkdd-charge,
title = {{Charge Own Job: Saliency mAP and Visual Word Encoder for Image-Level Semantic Segmentation}},
author = {Guo, Yuhui and Liang, Xun and Tang, Hui and Zheng, Xiangping and Wu, Bo and Zhang, Xuan},
booktitle = {European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases},
year = {2022},
pages = {546-561},
doi = {10.1007/978-3-031-26409-2_33},
url = {https://mlanthology.org/ecmlpkdd/2022/guo2022ecmlpkdd-charge/}
}