Can OOD Object Detectors Learn from Foundation Models?

Abstract

Out-of-distribution (OOD) object detection is a challenging task due to the absence of open-set OOD data. Inspired by recent advancements in text-to-image generative models, such as Stable Diffusion, we study the potential of generative models trained on large-scale open-set data to synthesize OOD samples, thereby enhancing OOD object detection. We introduce SyncOOD, a simple data curation method that capitalizes on the capabilities of large foundation models to automatically extract meaningful OOD data from text-to-image generative models. This offers the model access to open-world knowledge encapsulated within off-the-shelf foundation models. The synthetic OOD samples are then employed to augment the training of a lightweight, plug-and-play OOD detector, thus effectively optimizing the in-distribution (ID)/OOD decision boundaries. Extensive experiments across multiple benchmarks demonstrate that SyncOOD significantly outperforms existing methods, establishing new state-of-the-art performance with minimal synthetic data usage. The project is available at https://github.com/CVMI-Lab/SyncOOD.
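The core idea — training a lightweight, plug-and-play detector to separate ID features from synthetic OOD features — can be illustrated with a minimal sketch. Everything below is a toy stand-in, not the paper's implementation: the Gaussian "features" substitute for detector box features of ID objects and of synthetic OOD objects curated from a text-to-image model, and the detector is a single logistic layer trained with gradient descent.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-ins for box features (NOT real SyncOOD data):
# in-distribution features vs. features of synthetic OOD objects.
id_feats = rng.normal(loc=0.0, scale=1.0, size=(200, 16))
ood_feats = rng.normal(loc=3.0, scale=1.0, size=(200, 16))

X = np.vstack([id_feats, ood_feats])
y = np.concatenate([np.zeros(200), np.ones(200)])  # 0 = ID, 1 = OOD

# A lightweight "plug-and-play" head: one logistic layer trained
# separately from the (frozen) object detector it would attach to.
w = np.zeros(16)
b = 0.0
lr = 0.1
for _ in range(500):
    p = 1.0 / (1.0 + np.exp(-(X @ w + b)))  # sigmoid predictions
    w -= lr * (X.T @ (p - y)) / len(y)      # gradient of BCE loss w.r.t. w
    b -= lr * np.mean(p - y)                # gradient of BCE loss w.r.t. b

def ood_score(f):
    """Probability that a box feature lies on the OOD side of the boundary."""
    return 1.0 / (1.0 + np.exp(-(f @ w + b)))
```

At inference time, such a head would score each detected box and flag high-scoring ones as OOD; the paper's contribution is in how the synthetic OOD training features are curated, which this sketch does not attempt to reproduce.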

Cite

Text

Liu et al. "Can OOD Object Detectors Learn from Foundation Models?" Proceedings of the European Conference on Computer Vision (ECCV), 2024. doi:10.1007/978-3-031-73254-6_13

Markdown

[Liu et al. "Can OOD Object Detectors Learn from Foundation Models?" Proceedings of the European Conference on Computer Vision (ECCV), 2024.](https://mlanthology.org/eccv/2024/liu2024eccv-ood/) doi:10.1007/978-3-031-73254-6_13

BibTeX

@inproceedings{liu2024eccv-ood,
  title     = {{Can OOD Object Detectors Learn from Foundation Models?}},
  author    = {Liu, Jiahui and Wen, Xin and Zhao, Shizhen and Chen, Yingxian and Qi, Xiaojuan},
  booktitle = {Proceedings of the European Conference on Computer Vision (ECCV)},
  year      = {2024},
  doi       = {10.1007/978-3-031-73254-6_13},
  url       = {https://mlanthology.org/eccv/2024/liu2024eccv-ood/}
}