DiscoBox: Weakly Supervised Instance Segmentation and Semantic Correspondence from Box Supervision

Abstract

We introduce DiscoBox, a novel framework that jointly learns instance segmentation and semantic correspondence using bounding box supervision. Specifically, we propose a self-ensembling framework where instance segmentation and semantic correspondence are jointly guided by a structured teacher in addition to the bounding box supervision. The teacher is a structured energy model incorporating a pairwise potential and a cross-image potential to model the pairwise pixel relationships both within and across the boxes. Minimizing the teacher energy simultaneously yields refined object masks and dense correspondences between intra-class objects, which are taken as pseudo-labels to supervise the task network and provide positive/negative correspondence pairs for dense contrastive learning. We show a symbiotic relationship where the two tasks mutually benefit from each other. Our best model achieves 37.9% AP on COCO instance segmentation, surpassing prior weakly supervised methods and remaining competitive with supervised methods. We also obtain state-of-the-art weakly supervised results on PASCAL VOC12 and PF-PASCAL with real-time inference.

Cite

Text

Lan et al. "DiscoBox: Weakly Supervised Instance Segmentation and Semantic Correspondence from Box Supervision." International Conference on Computer Vision, 2021. doi:10.1109/ICCV48922.2021.00339

Markdown

[Lan et al. "DiscoBox: Weakly Supervised Instance Segmentation and Semantic Correspondence from Box Supervision." International Conference on Computer Vision, 2021.](https://mlanthology.org/iccv/2021/lan2021iccv-discobox/) doi:10.1109/ICCV48922.2021.00339

BibTeX

@inproceedings{lan2021iccv-discobox,
  title     = {{DiscoBox: Weakly Supervised Instance Segmentation and Semantic Correspondence from Box Supervision}},
  author    = {Lan, Shiyi and Yu, Zhiding and Choy, Christopher and Radhakrishnan, Subhashree and Liu, Guilin and Zhu, Yuke and Davis, Larry S. and Anandkumar, Anima},
  booktitle = {International Conference on Computer Vision},
  year      = {2021},
  pages     = {3406--3416},
  doi       = {10.1109/ICCV48922.2021.00339},
  url       = {https://mlanthology.org/iccv/2021/lan2021iccv-discobox/}
}