Omni-DETR: Omni-Supervised Object Detection with Transformers

Abstract

We consider the problem of omni-supervised object detection, which can use unlabeled, fully labeled, and weakly labeled data, with weak annotations such as image tags, counts, and points. This is enabled by a unified architecture, Omni-DETR, which builds on recent progress in student-teacher frameworks and end-to-end transformer-based object detection. Under this unified architecture, different types of weak labels can be leveraged to generate accurate pseudo labels for the model to learn from, through a bipartite-matching-based filtering mechanism. In experiments, Omni-DETR achieves state-of-the-art results on multiple datasets and settings. We also find that weak annotations help improve detection performance, and that a mixture of them achieves a better trade-off between annotation cost and accuracy than standard complete annotation. These findings could encourage larger object detection datasets with mixed annotations. The code is available at https://github.com/amazon-research/omni-detr.
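
To make the idea of bipartite-matching-based pseudo-label filtering more concrete, the following is a minimal sketch for point annotations only. It is not the authors' implementation: the cost function (`pair_cost`), the dictionary layout of predictions and annotations, and the function name `filter_pseudo_labels` are all assumptions made for illustration, and the exhaustive search stands in for a proper Hungarian solver such as `scipy.optimize.linear_sum_assignment`.

```python
from itertools import permutations

BIG = 1e6  # assumed cost for an infeasible (prediction, point) pairing


def point_in_box(point, box):
    """True if point (x, y) lies inside box (x0, y0, x1, y1)."""
    x, y = point
    x0, y0, x1, y1 = box
    return x0 <= x <= x1 and y0 <= y <= y1


def pair_cost(pred, ann):
    """Hypothetical matching cost: infeasible unless the class agrees and the
    annotated point falls inside the predicted box; otherwise prefer high scores."""
    if pred["label"] != ann["label"] or not point_in_box(ann["point"], pred["box"]):
        return BIG
    return 1.0 - pred["score"]


def filter_pseudo_labels(preds, anns):
    """Keep the teacher predictions selected by a minimum-cost one-to-one
    matching against the weak point annotations."""
    n_p, n_a = len(preds), len(anns)
    # Enumerate every one-to-one assignment as a list of (pred_idx, ann_idx).
    # This brute force is only viable for toy sizes.
    if n_p >= n_a:
        candidates = ([(p, a) for a, p in enumerate(perm)]
                      for perm in permutations(range(n_p), n_a))
    else:
        candidates = ([(p, a) for p, a in enumerate(perm)]
                      for perm in permutations(range(n_a), n_p))
    best = min(candidates,
               key=lambda pairs: sum(pair_cost(preds[p], anns[a]) for p, a in pairs))
    # Drop pairs that were only matched because nothing feasible was left.
    return [preds[p] for p, a in best if pair_cost(preds[p], anns[a]) < BIG]


# Toy teacher predictions and weak point labels (made-up values).
preds = [
    {"box": (0, 0, 10, 10), "label": "cat", "score": 0.9},
    {"box": (0, 0, 10, 10), "label": "cat", "score": 0.4},    # duplicate detection
    {"box": (20, 20, 30, 30), "label": "dog", "score": 0.3},  # low score, has a point
    {"box": (50, 50, 60, 60), "label": "dog", "score": 0.95}, # high score, no point
]
anns = [
    {"point": (5, 5), "label": "cat"},
    {"point": (25, 25), "label": "dog"},
]

kept = filter_pseudo_labels(preds, anns)
print([p["score"] for p in kept])
```

In this toy input, the low-scoring dog box is kept because a point annotation supports it, while the high-scoring box with no supporting point and the duplicate cat box are dropped. A plain confidence threshold would not make that distinction.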

Cite

Text

Wang et al. "Omni-DETR: Omni-Supervised Object Detection with Transformers." Conference on Computer Vision and Pattern Recognition, 2022. doi:10.1109/CVPR52688.2022.00915

Markdown

[Wang et al. "Omni-DETR: Omni-Supervised Object Detection with Transformers." Conference on Computer Vision and Pattern Recognition, 2022.](https://mlanthology.org/cvpr/2022/wang2022cvpr-omnidetr/) doi:10.1109/CVPR52688.2022.00915

BibTeX

@inproceedings{wang2022cvpr-omnidetr,
  title     = {{Omni-DETR: Omni-Supervised Object Detection with Transformers}},
  author    = {Wang, Pei and Cai, Zhaowei and Yang, Hao and Swaminathan, Gurumurthy and Vasconcelos, Nuno and Schiele, Bernt and Soatto, Stefano},
  booktitle = {Conference on Computer Vision and Pattern Recognition},
  year      = {2022},
  pages     = {9367--9376},
  doi       = {10.1109/CVPR52688.2022.00915},
  url       = {https://mlanthology.org/cvpr/2022/wang2022cvpr-omnidetr/}
}