DAP: Detection-Aware Pre-Training with Weak Supervision

Abstract

This paper presents a detection-aware pre-training (DAP) approach, which leverages only weakly-labeled classification-style datasets (e.g., ImageNet) for pre-training, but is specifically tailored to benefit object detection tasks. In contrast to the widely used image classification-based pre-training (e.g., on ImageNet), which does not include any location-related training tasks, we transform a classification dataset into a detection dataset through a weakly supervised object localization method based on Class Activation Maps and use it to directly pre-train a detector. This makes the pre-trained model location-aware and capable of predicting bounding boxes. We show that DAP can outperform the traditional classification pre-training in terms of both sample efficiency and convergence speed on downstream detection tasks such as VOC and COCO. In particular, DAP boosts detection accuracy by a large margin when the number of examples in the downstream task is small.
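To make the pipeline concrete, the sketch below shows one way Class Activation Maps can be turned into pseudo bounding boxes so that a classification dataset can supervise a detector. This is a minimal illustration, not the authors' released code: the thresholding scheme and its hyper-parameter are assumptions made here for clarity.

# Minimal sketch (assumed, not the paper's exact procedure) of converting a
# Class Activation Map into a pseudo bounding box for detection pre-training.
import numpy as np

def cam_to_box(cam: np.ndarray, threshold: float = 0.5):
    """Derive one pseudo box (x_min, y_min, x_max, y_max) from a CAM.

    cam: 2D activation map for the image-level class, already resized to the
         image resolution.
    threshold: fraction of the CAM's peak value used to binarize the map
               (an illustrative hyper-parameter, not taken from the paper).
    """
    # Normalize the CAM to [0, 1] and keep only the strongly activated region.
    cam = (cam - cam.min()) / (cam.max() - cam.min() + 1e-8)
    mask = cam >= threshold
    if not mask.any():
        return None  # no confident region for this class

    # Tightest axis-aligned box around the activated region.
    ys, xs = np.where(mask)
    return int(xs.min()), int(ys.min()), int(xs.max()), int(ys.max())

# Each (image, class label) pair in the classification set then yields an
# (image, pseudo box, class label) triple, which can be used to pre-train a
# standard detector before fine-tuning on the downstream detection task.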

Cite

Text

Zhong et al. "DAP: Detection-Aware Pre-Training with Weak Supervision." Conference on Computer Vision and Pattern Recognition, 2021. doi:10.1109/CVPR46437.2021.00451

Markdown

[Zhong et al. "DAP: Detection-Aware Pre-Training with Weak Supervision." Conference on Computer Vision and Pattern Recognition, 2021.](https://mlanthology.org/cvpr/2021/zhong2021cvpr-dap/) doi:10.1109/CVPR46437.2021.00451

BibTeX

@inproceedings{zhong2021cvpr-dap,
  title     = {{DAP: Detection-Aware Pre-Training with Weak Supervision}},
  author    = {Zhong, Yuanyi and Wang, Jianfeng and Wang, Lijuan and Peng, Jian and Wang, Yu-Xiong and Zhang, Lei},
  booktitle = {Conference on Computer Vision and Pattern Recognition},
  year      = {2021},
  pages     = {4537--4546},
  doi       = {10.1109/CVPR46437.2021.00451},
  url       = {https://mlanthology.org/cvpr/2021/zhong2021cvpr-dap/}
}