Object Detection with Self-Supervised Scene Adaptation

Abstract

This paper proposes a novel method to improve the performance of a trained object detector on scenes with fixed camera perspectives based on self-supervised adaptation. Given a specific scene, the trained detector is adapted using pseudo-ground-truth labels generated in a cross-teaching manner by the detector itself and an object tracker. Because the camera perspective is fixed, our method can exploit background equivariance: it uses artifact-free object mixup as a data augmentation, and accurate background extraction as an additional input modality. We also introduce a large-scale and diverse dataset for the development and evaluation of scene-adaptive object detection. Experiments on this dataset show that our method improves the average precision of the original detector, outperforming previous state-of-the-art self-supervised domain-adaptive object detection methods by a large margin. Our dataset and code are published at https://github.com/cvlab-stonybrook/scenes100.
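The two fixed-camera ideas in the abstract can be illustrated with a minimal NumPy sketch. This is not the paper's implementation; `extract_background` (a simple temporal median) and `object_mixup` (a direct copy-paste of detected boxes between frames of the same scene) are hypothetical helpers that only demonstrate why a static camera makes the paste artifact-free: the background pixels around each pasted box already match.

```python
import numpy as np

def extract_background(frames):
    """Illustrative background extraction: per-pixel temporal median
    over frames from a fixed camera approximates the static background,
    since moving objects occupy each pixel only briefly."""
    return np.median(np.stack(frames), axis=0).astype(frames[0].dtype)

def object_mixup(target, source, boxes):
    """Illustrative artifact-free object mixup: paste detected-object
    regions (x1, y1, x2, y2 boxes) from a source frame into a target
    frame of the SAME scene. With a fixed camera the surrounding
    background pixels already agree, so no blending is needed."""
    mixed = target.copy()
    for x1, y1, x2, y2 in boxes:
        mixed[y1:y2, x1:x2] = source[y1:y2, x1:x2]
    return mixed
```

With a moving camera, the same copy-paste would leave visible seams where the two backgrounds disagree; the fixed perspective is what removes that failure mode.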

Cite

Text

Zhang and Hoai. "Object Detection with Self-Supervised Scene Adaptation." Conference on Computer Vision and Pattern Recognition, 2023. doi:10.1109/CVPR52729.2023.02068

Markdown

[Zhang and Hoai. "Object Detection with Self-Supervised Scene Adaptation." Conference on Computer Vision and Pattern Recognition, 2023.](https://mlanthology.org/cvpr/2023/zhang2023cvpr-object/) doi:10.1109/CVPR52729.2023.02068

BibTeX

@inproceedings{zhang2023cvpr-object,
  title     = {{Object Detection with Self-Supervised Scene Adaptation}},
  author    = {Zhang, Zekun and Hoai, Minh},
  booktitle = {Conference on Computer Vision and Pattern Recognition},
  year      = {2023},
  pages     = {21589--21599},
  doi       = {10.1109/CVPR52729.2023.02068},
  url       = {https://mlanthology.org/cvpr/2023/zhang2023cvpr-object/}
}