Sparse Semi-DETR: Sparse Learnable Queries for Semi-Supervised Object Detection

Abstract

In this paper, we address the limitations of the DETR-based semi-supervised object detection (SSOD) framework, focusing in particular on the challenges posed by the quality of object queries. In DETR-based SSOD, the one-to-one assignment strategy provides inaccurate pseudo-labels, while the one-to-many assignment strategy leads to overlapping predictions. These issues compromise training efficiency and degrade model performance, especially in detecting small or occluded objects. We introduce Sparse Semi-DETR, a novel transformer-based, end-to-end semi-supervised object detection solution, to overcome these challenges. Sparse Semi-DETR incorporates a Query Refinement Module to enhance the quality of object queries, significantly improving detection capabilities for small and partially obscured objects. Additionally, we integrate a Reliable Pseudo-Label Filtering Module that selectively filters high-quality pseudo-labels, thereby enhancing detection accuracy and consistency. On the MS-COCO and Pascal VOC object detection benchmarks, Sparse Semi-DETR achieves a significant improvement over current state-of-the-art methods, highlighting its effectiveness in semi-supervised object detection, particularly in challenging scenarios involving small or partially obscured objects.

Cite

Text

Shehzadi et al. "Sparse Semi-DETR: Sparse Learnable Queries for Semi-Supervised Object Detection." Conference on Computer Vision and Pattern Recognition, 2024. doi:10.1109/CVPR52733.2024.00558

Markdown

[Shehzadi et al. "Sparse Semi-DETR: Sparse Learnable Queries for Semi-Supervised Object Detection." Conference on Computer Vision and Pattern Recognition, 2024.](https://mlanthology.org/cvpr/2024/shehzadi2024cvpr-sparse/) doi:10.1109/CVPR52733.2024.00558

BibTeX

@inproceedings{shehzadi2024cvpr-sparse,
  title     = {{Sparse Semi-DETR: Sparse Learnable Queries for Semi-Supervised Object Detection}},
  author    = {Shehzadi, Tahira and Hashmi, Khurram Azeem and Stricker, Didier and Afzal, Muhammad Zeshan},
  booktitle = {Conference on Computer Vision and Pattern Recognition},
  year      = {2024},
  pages     = {5840--5850},
  doi       = {10.1109/CVPR52733.2024.00558},
  url       = {https://mlanthology.org/cvpr/2024/shehzadi2024cvpr-sparse/}
}