SIM: Semantic-Aware Instance Mask Generation for Box-Supervised Instance Segmentation

Abstract

Weakly supervised instance segmentation using only bounding box annotations has recently attracted much research attention. Most current methods leverage low-level image features as extra supervision without explicitly exploiting the high-level semantic information of the objects, and they become ineffective when foreground objects have a similar appearance to the background or to other nearby objects. We propose a new box-supervised instance segmentation approach by developing a Semantic-aware Instance Mask (SIM) generation paradigm. Instead of relying heavily on local pair-wise affinities among neighboring pixels, we construct a group of category-wise feature centroids as prototypes to identify foreground objects and assign them semantic-level pseudo labels. Because the semantic-aware prototypes cannot distinguish different instances of the same semantic class, we propose a self-correction mechanism to rectify the falsely activated regions while enhancing the correct ones. Furthermore, to handle occlusions between objects, we tailor the Copy-Paste operation to the weakly supervised instance segmentation task to augment challenging training data. Extensive experimental results demonstrate the superiority of our proposed SIM approach over other state-of-the-art methods. The source code is available at https://github.com/lslrh/SIM.
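
The prototype idea in the abstract (category-wise feature centroids used to assign semantic-level pseudo labels) can be illustrated with a minimal sketch. This is not the authors' implementation; see the linked repository for that. The function names, the use of cosine similarity, the background index `0`, and the `in_box` mask are all assumptions made for illustration only.

```python
import math

BACKGROUND = 0  # assumed index of the background class (illustrative choice)


def class_prototypes(features, labels, num_classes):
    """Average the feature vectors of each class to get one centroid per class."""
    dim = len(features[0])
    sums = [[0.0] * dim for _ in range(num_classes)]
    counts = [0] * num_classes
    for vec, lab in zip(features, labels):
        counts[lab] += 1
        for d, value in enumerate(vec):
            sums[lab][d] += value
    # Classes with no samples keep an all-zero prototype.
    return [[s / max(c, 1) for s in row] for row, c in zip(sums, counts)]


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / (norm + 1e-12)


def assign_pseudo_labels(features, prototypes, in_box=None):
    """Label each pixel feature with the index of its most similar prototype.

    If `in_box` is given, pixels outside every bounding box are forced to
    background, since box annotations guarantee no object lies there.
    """
    labels = []
    for i, vec in enumerate(features):
        if in_box is not None and not in_box[i]:
            labels.append(BACKGROUND)
            continue
        sims = [cosine(vec, proto) for proto in prototypes]
        labels.append(max(range(len(sims)), key=sims.__getitem__))
    return labels
```

In this sketch, `class_prototypes` would be called on pixel features whose class is already known with some confidence, and `assign_pseudo_labels` on the remaining pixels. As the abstract notes, labels obtained this way are semantic only: two instances of the same class receive the same label, which is the limitation the paper's self-correction mechanism is meant to address.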

Cite

Text

Li et al. "SIM: Semantic-Aware Instance Mask Generation for Box-Supervised Instance Segmentation." Conference on Computer Vision and Pattern Recognition, 2023. doi:10.1109/CVPR52729.2023.00695

Markdown

[Li et al. "SIM: Semantic-Aware Instance Mask Generation for Box-Supervised Instance Segmentation." Conference on Computer Vision and Pattern Recognition, 2023.](https://mlanthology.org/cvpr/2023/li2023cvpr-sim/) doi:10.1109/CVPR52729.2023.00695

BibTeX

@inproceedings{li2023cvpr-sim,
  title     = {{SIM: Semantic-Aware Instance Mask Generation for Box-Supervised Instance Segmentation}},
  author    = {Li, Ruihuang and He, Chenhang and Zhang, Yabin and Li, Shuai and Chen, Liyi and Zhang, Lei},
  booktitle = {Conference on Computer Vision and Pattern Recognition},
  year      = {2023},
  pages     = {7193--7203},
  doi       = {10.1109/CVPR52729.2023.00695},
  url       = {https://mlanthology.org/cvpr/2023/li2023cvpr-sim/}
}