A Simple-but-Effective Baseline for Training-Free Class-Agnostic Counting
Abstract
Class-Agnostic Counting (CAC) seeks to accurately count objects in a given image with only a few reference examples. While previous methods achieving this relied on additional training, recent efforts have shown that it is possible to accomplish this without training by utilizing pre-existing foundation models, particularly the Segment Anything Model (SAM), for counting via instance-level segmentation. Although promising, current training-free methods still lag behind their training-based counterparts in terms of performance. In this research, we present a straightforward training-free solution that effectively bridges this performance gap, serving as a strong baseline. The primary contribution of our work lies in the discovery of four key techniques that can enhance performance. Specifically, we suggest employing a superpixel algorithm to generate more precise initial point prompts, utilizing an image encoder with richer semantic knowledge to replace the SAM encoder for representing candidate objects, and adopting a multiscale mechanism and a transductive prototype scheme to update the representation of reference examples. By combining these four techniques, our approach achieves significant improvements over existing training-free methods and delivers performance on par with training-based ones.
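As a rough illustration of the transductive prototype idea mentioned in the abstract, the following minimal sketch counts candidates by cosine similarity to a reference prototype, then refines that prototype with the candidates it matched. It is not the authors' implementation: the function names, the similarity threshold, the number of refinement rounds, and the equal-weight averaging rule are all assumptions made for illustration, and the feature vectors stand in for whatever an image encoder would produce.

```python
import numpy as np


def l2_normalize(x, axis=-1, eps=1e-12):
    """Scale vectors to unit length so dot products are cosine similarities."""
    return x / (np.linalg.norm(x, axis=axis, keepdims=True) + eps)


def count_with_transductive_prototype(candidates, references,
                                      threshold=0.5, rounds=2):
    """Count candidates similar to a prototype that is refined transductively.

    candidates: (N, D) feature vectors, one per candidate object mask.
    references: (K, D) feature vectors of the reference examples.
    threshold and rounds are hypothetical values chosen for this sketch.
    """
    cands = l2_normalize(np.asarray(candidates, dtype=float))
    # Initial prototype: normalized mean of the reference features.
    ref_proto = l2_normalize(np.asarray(references, dtype=float).mean(axis=0))
    prototype = ref_proto

    for _ in range(rounds):
        sims = cands @ prototype
        matched = sims >= threshold
        if not matched.any():
            break
        # Assumed update rule: blend the reference prototype with the mean
        # of the currently matched candidates, then renormalize.
        prototype = l2_normalize(ref_proto + cands[matched].mean(axis=0))

    sims = cands @ prototype
    return int((sims >= threshold).sum()), prototype


# Toy features: three candidates resemble the reference, two do not.
toy_candidates = [[1.0, 0.0], [0.9, 0.1], [0.8, 0.2], [0.0, 1.0], [0.1, 0.9]]
toy_references = [[1.0, 0.0]]
toy_count, toy_prototype = count_with_transductive_prototype(
    toy_candidates, toy_references
)
```

On the toy data, the three candidates close to the reference direction are counted and the two near the orthogonal direction are rejected. In a real pipeline the threshold and update rule would need tuning against the encoder's feature statistics.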
Cite
Text
Lin et al. "A Simple-but-Effective Baseline for Training-Free Class-Agnostic Counting." Winter Conference on Applications of Computer Vision, 2025.
Markdown
[Lin et al. "A Simple-but-Effective Baseline for Training-Free Class-Agnostic Counting." Winter Conference on Applications of Computer Vision, 2025.](https://mlanthology.org/wacv/2025/lin2025wacv-simplebuteffective/)
BibTeX
@inproceedings{lin2025wacv-simplebuteffective,
title = {{A Simple-but-Effective Baseline for Training-Free Class-Agnostic Counting}},
author = {Lin, Yuhao and Xu, Haiming and Liu, Lingqiao and Shi, Javen Qinfeng},
booktitle = {Winter Conference on Applications of Computer Vision},
year = {2025},
pages = {8144--8153},
url = {https://mlanthology.org/wacv/2025/lin2025wacv-simplebuteffective/}
}