Contrastive Mean Teacher for Domain Adaptive Object Detectors
Abstract
Object detectors often suffer from the domain gap between training (source domain) and real-world applications (target domain). Mean-teacher self-training is a powerful paradigm in unsupervised domain adaptation for object detection, but it struggles with low-quality pseudo-labels. In this work, we identify the intriguing alignment and synergy between mean-teacher self-training and contrastive learning. Motivated by this, we propose Contrastive Mean Teacher (CMT) -- a unified, general-purpose framework with the two paradigms naturally integrated to maximize beneficial learning signals. Instead of using pseudo-labels solely for final predictions, our strategy extracts object-level features using pseudo-labels and optimizes them via contrastive learning, without requiring labels in the target domain. When combined with recent mean-teacher self-training methods, CMT leads to new state-of-the-art target-domain performance: 51.9% mAP on Foggy Cityscapes, outperforming the previous best by 2.1% mAP. Notably, CMT can stabilize performance and provide more significant gains as pseudo-label noise increases.
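To make the object-level contrastive idea concrete, here is a minimal, illustrative sketch (not the authors' implementation): each student-branch object feature is paired with the teacher-branch feature of the same pseudo-labeled object as its positive, while the teacher features of all other objects in the batch serve as negatives, yielding an InfoNCE-style loss. The function name, feature shapes, and temperature value are assumptions for illustration only.

```python
import numpy as np

def object_contrastive_loss(student_feats, teacher_feats, temperature=0.07):
    """InfoNCE-style object-level contrastive loss (illustrative sketch).

    student_feats, teacher_feats: (N, D) arrays of features for the same N
    pseudo-labeled objects, extracted from the student and teacher branches.
    The positive pair for object i is (student i, teacher i); all other
    teacher features in the batch act as negatives.
    """
    # L2-normalize so dot products are cosine similarities
    s = student_feats / np.linalg.norm(student_feats, axis=1, keepdims=True)
    t = teacher_feats / np.linalg.norm(teacher_feats, axis=1, keepdims=True)
    logits = s @ t.T / temperature                 # (N, N) similarity matrix
    logits -= logits.max(axis=1, keepdims=True)    # numerical stability
    # Row-wise log-softmax; positives sit on the diagonal
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_probs))

# Toy usage: 8 pseudo-labeled objects with 128-dim RoI features
rng = np.random.default_rng(0)
student = rng.standard_normal((8, 128))
teacher = rng.standard_normal((8, 128))
loss = object_contrastive_loss(student, teacher)
```

Perfectly aligned student/teacher features drive this loss toward zero, while mismatched features are penalized, which is the learning signal the framework adds on top of pseudo-label supervision.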
Cite
Text

Cao et al. "Contrastive Mean Teacher for Domain Adaptive Object Detectors." Conference on Computer Vision and Pattern Recognition, 2023. doi:10.1109/CVPR52729.2023.02283

Markdown

[Cao et al. "Contrastive Mean Teacher for Domain Adaptive Object Detectors." Conference on Computer Vision and Pattern Recognition, 2023.](https://mlanthology.org/cvpr/2023/cao2023cvpr-contrastive/) doi:10.1109/CVPR52729.2023.02283

BibTeX
@inproceedings{cao2023cvpr-contrastive,
title = {{Contrastive Mean Teacher for Domain Adaptive Object Detectors}},
author = {Cao, Shengcao and Joshi, Dhiraj and Gui, Liang-Yan and Wang, Yu-Xiong},
booktitle = {Conference on Computer Vision and Pattern Recognition},
year = {2023},
pages = {23839--23848},
doi = {10.1109/CVPR52729.2023.02283},
url = {https://mlanthology.org/cvpr/2023/cao2023cvpr-contrastive/}
}