Cyclic-Bootstrap Labeling for Weakly Supervised Object Detection

Abstract

Recent progress in weakly supervised object detection is characterized by a combination of multiple instance detection networks (MIDN) and ordinal online refinement. However, with only image-level annotation, MIDN inevitably assigns high scores to some unexpected region proposals when generating pseudo labels. These inaccurate high-scoring region proposals mislead the training of subsequent refinement modules and thus hamper detection performance. In this work, we explore how to improve the quality of pseudo-labeling in MIDN. To this end, we devise Cyclic-Bootstrap Labeling (CBL), a novel weakly supervised object detection pipeline that optimizes MIDN with rank information from a reliable teacher network. Specifically, we obtain this teacher network by introducing a weighted exponential moving average strategy that takes advantage of the various refinement modules. A novel class-specific ranking distillation algorithm is proposed to leverage the output of the weighted ensemble teacher network for distilling MIDN with rank information. As a result, MIDN is guided to assign higher scores to accurate proposals, which in turn benefits final detection. Extensive experiments on the prevalent PASCAL VOC 2007 & 2012 and COCO datasets demonstrate the superior performance of our CBL framework.
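The two ingredients named in the abstract, a weighted exponential moving average teacher and rank-based distillation, can be illustrated with a rough sketch. The abstract does not give the actual formulation, so everything below is an assumption: the function names, the weights, the momentum value, and the use of a generic pairwise hinge loss in place of the paper's class-specific ranking distillation.

```python
# Illustrative sketch only: NOT the CBL implementation from the paper.
# All names and hyperparameters here are hypothetical.

def weighted_ema_update(teacher, students, weights, momentum=0.999):
    """Move teacher parameters toward a weighted blend of several student heads.

    teacher:  list of floats (flattened teacher parameters)
    students: list of parameter lists, one per refinement module (assumed layout)
    weights:  one non-negative weight per student; normalized to sum to 1
    """
    total = sum(weights)
    norm = [w / total for w in weights]
    updated = []
    for i, t in enumerate(teacher):
        # Weighted average of the i-th parameter across all students.
        blended = sum(w * s[i] for w, s in zip(norm, students))
        # Standard EMA step toward the blended value.
        updated.append(momentum * t + (1.0 - momentum) * blended)
    return updated


def pairwise_rank_loss(student_scores, teacher_scores, margin=0.0):
    """Generic pairwise hinge loss (an assumed stand-in for ranking distillation).

    For every proposal pair (i, j) that the teacher ranks i above j,
    penalize the student when it does not score i above j by `margin`.
    Scores are assumed to belong to a single class.
    """
    loss = 0.0
    pairs = 0
    n = len(teacher_scores)
    for i in range(n):
        for j in range(n):
            if teacher_scores[i] > teacher_scores[j]:
                loss += max(0.0, margin - (student_scores[i] - student_scores[j]))
                pairs += 1
    return loss / pairs if pairs else 0.0
```

In this sketch the loss is zero whenever the student orders every proposal pair the same way the teacher does, so only the ranking, not the absolute score values, is transferred.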

Cite

Text

Yin et al. "Cyclic-Bootstrap Labeling for Weakly Supervised Object Detection." International Conference on Computer Vision, 2023. doi:10.1109/ICCV51070.2023.00645

Markdown

[Yin et al. "Cyclic-Bootstrap Labeling for Weakly Supervised Object Detection." International Conference on Computer Vision, 2023.](https://mlanthology.org/iccv/2023/yin2023iccv-cyclicbootstrap/) doi:10.1109/ICCV51070.2023.00645

BibTeX

@inproceedings{yin2023iccv-cyclicbootstrap,
  title     = {{Cyclic-Bootstrap Labeling for Weakly Supervised Object Detection}},
  author    = {Yin, Yufei and Deng, Jiajun and Zhou, Wengang and Li, Li and Li, Houqiang},
  booktitle = {International Conference on Computer Vision},
  year      = {2023},
  pages     = {7008--7018},
  doi       = {10.1109/ICCV51070.2023.00645},
  url       = {https://mlanthology.org/iccv/2023/yin2023iccv-cyclicbootstrap/}
}