Learning to Count Without Annotations

Abstract

While recent supervised methods for reference-based object counting continue to improve performance on benchmark datasets, they rely on small datasets due to the cost of manually annotating dozens of objects per image. We propose UnCounTR, a model that can learn this task without requiring any manual annotations. To this end, we construct "Self-Collages", images with various pasted objects, as training samples that provide a rich learning signal covering arbitrary object types and counts. Our method builds on existing unsupervised representation and segmentation techniques and demonstrates, for the first time, reference-based counting without manual supervision. Our experiments show that our method not only outperforms simple baselines and generic models such as FasterRCNN and DETR, but also matches the performance of supervised counting models in some domains.
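The Self-Collage idea described in the abstract can be sketched in a few lines: paste a random number of object crops onto a background image, and use the number of pastes as a free counting label. The sketch below is a minimal illustration assuming array-based images; the function name, parameters, and sampling choices are hypothetical and not taken from the paper.

```python
import random
import numpy as np

def make_self_collage(background, objects, n_min=1, n_max=20):
    """Build a hypothetical Self-Collage training sample.

    background: H x W x 3 uint8 array.
    objects:    list of h x w x 3 uint8 crops to paste.
    Returns the collage and the number of pasted objects,
    which serves as the counting target without any manual annotation.
    """
    collage = background.copy()
    count = random.randint(n_min, n_max)
    H, W = collage.shape[:2]
    for _ in range(count):
        obj = random.choice(objects)
        h, w = obj.shape[:2]
        # Sample a random top-left corner so the crop fits in the image.
        y = random.randint(0, H - h)
        x = random.randint(0, W - w)
        collage[y:y + h, x:x + w] = obj
    return collage, count
```

In the actual method, the pasted objects come from unsupervised segmentation rather than pre-cropped patches, but the resulting (image, count) pair plays the same role as a supervised training example.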

Cite

Text

Knobel et al. "Learning to Count Without Annotations." Conference on Computer Vision and Pattern Recognition, 2024. doi:10.1109/CVPR52733.2024.02163

Markdown

[Knobel et al. "Learning to Count Without Annotations." Conference on Computer Vision and Pattern Recognition, 2024.](https://mlanthology.org/cvpr/2024/knobel2024cvpr-learning/) doi:10.1109/CVPR52733.2024.02163

BibTeX

@inproceedings{knobel2024cvpr-learning,
  title     = {{Learning to Count Without Annotations}},
  author    = {Knobel, Lukas and Han, Tengda and Asano, Yuki M.},
  booktitle = {Conference on Computer Vision and Pattern Recognition},
  year      = {2024},
  pages     = {22924--22934},
  doi       = {10.1109/CVPR52733.2024.02163},
  url       = {https://mlanthology.org/cvpr/2024/knobel2024cvpr-learning/}
}