Pearls from Pebbles: Improved Confidence Functions for Auto-Labeling
Abstract
Auto-labeling techniques produce labeled data with minimal manual annotation by using representations from self-supervised models together with confidence scores. A popular technique, threshold-based auto-labeling (TBAL), trains a model on these representations and the manual annotations, then assigns the model's prediction as the label for every point where the model's confidence score exceeds a certain threshold. However, the model's scores can be overconfident, which leads to poor performance. We show that calibration, a common remedy for overconfidence, falls short of solving this problem for TBAL. Thus, instead of using existing calibration methods, we introduce a framework for optimal confidence functions for TBAL and develop \texttt{Colander}, a method designed to maximize auto-labeling performance. We perform an extensive empirical evaluation of \texttt{Colander} and other confidence functions, using representations from CLIP and text embedding models for image and text data, respectively. We find that \texttt{Colander} achieves up to 60\% improvement in coverage (the proportion of points labeled by the model) over the baselines while keeping the error level below 5\% and using the same amount of labeled data.
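To make the thresholding step and the coverage metric concrete, here is a minimal sketch of generic threshold-based auto-labeling. It is not the \texttt{Colander} method and not the authors' implementation. The function names, the toy data, and the rule for picking the threshold (the smallest candidate whose auto-labeling error on a held-out labeled set stays within a tolerance) are assumptions made for illustration only.

```python
from typing import List, Optional, Sequence


def auto_label(confidences: Sequence[float],
               predictions: Sequence[int],
               threshold: float) -> List[Optional[int]]:
    """Keep the model's prediction where confidence exceeds the threshold, else None."""
    return [p if c > threshold else None
            for c, p in zip(confidences, predictions)]


def coverage(assigned: Sequence[Optional[int]]) -> float:
    """Proportion of points that received an auto-label."""
    if not assigned:
        return 0.0
    return sum(a is not None for a in assigned) / len(assigned)


def auto_label_error(assigned: Sequence[Optional[int]],
                     true_labels: Sequence[int]) -> float:
    """Fraction of auto-labeled points whose assigned label is wrong."""
    labeled = [(a, t) for a, t in zip(assigned, true_labels) if a is not None]
    if not labeled:
        return 0.0
    return sum(a != t for a, t in labeled) / len(labeled)


def pick_threshold(val_conf: Sequence[float],
                   val_pred: Sequence[int],
                   val_true: Sequence[int],
                   max_error: float) -> float:
    """Assumed selection rule: smallest candidate threshold whose
    auto-labeling error on the validation set is at most max_error."""
    candidates = sorted({0.0} | set(val_conf))
    for t in candidates:
        assigned = auto_label(val_conf, val_pred, t)
        if auto_label_error(assigned, val_true) <= max_error:
            return t
    # Unreachable in practice: at the largest confidence nothing is labeled,
    # so the error is 0.0. Kept as a safe fallback.
    return candidates[-1]


# Toy validation set (hypothetical numbers): scores, predictions, true labels.
val_conf = [0.95, 0.90, 0.80, 0.60, 0.55]
val_pred = [1, 0, 1, 1, 0]
val_true = [1, 0, 1, 0, 0]

threshold = pick_threshold(val_conf, val_pred, val_true, max_error=0.05)

# Apply the chosen threshold to an unlabeled pool (hypothetical numbers).
pool_conf = [0.99, 0.70, 0.58, 0.40]
pool_pred = [0, 1, 1, 0]
pool_labels = auto_label(pool_conf, pool_pred, threshold)
pool_coverage = coverage(pool_labels)
```

In this sketch an overconfident score matters directly: a wrong prediction with a high score forces the threshold upward and shrinks coverage, which is the failure mode the abstract describes.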
Cite
Text
Vishwakarma et al. "Pearls from Pebbles: Improved Confidence Functions for Auto-Labeling." NeurIPS 2024 Workshops: SSL, 2024.
Markdown
[Vishwakarma et al. "Pearls from Pebbles: Improved Confidence Functions for Auto-Labeling." NeurIPS 2024 Workshops: SSL, 2024.](https://mlanthology.org/neuripsw/2024/vishwakarma2024neuripsw-pearls/)
BibTeX
@inproceedings{vishwakarma2024neuripsw-pearls,
title = {{Pearls from Pebbles: Improved Confidence Functions for Auto-Labeling}},
author = {Vishwakarma, Harit and Chen, Yi and Tay, Sui Jiet and Gnvv, Satya Sai Srinath Namburi and Sala, Frederic and Vinayak, Ramya Korlakai},
booktitle = {NeurIPS 2024 Workshops: SSL},
year = {2024},
url = {https://mlanthology.org/neuripsw/2024/vishwakarma2024neuripsw-pearls/}
}