Plain-Det: A Plain Multi-Dataset Object Detector

Abstract

Recent advancements in large-scale foundational models have sparked widespread interest in training highly proficient large vision models. A common consensus is that aggregating extensive, high-quality annotated data is necessary. However, given the inherent challenges in annotating dense tasks in computer vision, such as object detection and segmentation, a practical strategy is to combine and leverage all available data for training. In this work, we propose Plain-Det, which offers flexibility to accommodate new datasets, robust performance across diverse datasets, training efficiency, and compatibility with various detection architectures. Using Def-DETR with Plain-Det, we achieve a mAP of 51.9 on COCO, matching current state-of-the-art detectors. We conduct extensive experiments on 13 downstream datasets, and Plain-Det demonstrates strong generalization capability. Code is released at https://github.com/ChengShiest/Plain-Det.

Cite

Text

Shi et al. "Plain-Det: A Plain Multi-Dataset Object Detector." Proceedings of the European Conference on Computer Vision (ECCV), 2024. doi:10.1007/978-3-031-72652-1_13

Markdown

[Shi et al. "Plain-Det: A Plain Multi-Dataset Object Detector." Proceedings of the European Conference on Computer Vision (ECCV), 2024.](https://mlanthology.org/eccv/2024/shi2024eccv-plaindet/) doi:10.1007/978-3-031-72652-1_13

BibTeX

@inproceedings{shi2024eccv-plaindet,
  title     = {{Plain-Det: A Plain Multi-Dataset Object Detector}},
  author    = {Shi, Cheng and Zhu, Yuchen and Yang, Sibei},
  booktitle = {Proceedings of the European Conference on Computer Vision (ECCV)},
  year      = {2024},
  doi       = {10.1007/978-3-031-72652-1_13},
  url       = {https://mlanthology.org/eccv/2024/shi2024eccv-plaindet/}
}