Fine-Tuning Vision Classifiers on a Budget

Abstract

Fine-tuning modern computer vision models requires accurately labeled data, yet ground-truth labels may be unavailable; instead, multiple labels can be obtained from labelers of variable accuracy. We tie label quality to confidence derived from historical labeler accuracy using a simple naive-Bayes model. Imputing true labels in this way allows us to label more data on a fixed budget without compromising label or fine-tuning quality. We present experiments on a dataset of industrial images that demonstrate that our method, called Ground Truth Extension (GTX), enables fine-tuning ML models using fewer human labels.
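The naive-Bayes imputation described in the abstract can be sketched as follows. This is an illustrative sketch, not the paper's exact GTX model: it assumes each labeler is correct with their historical accuracy and, when wrong, errs uniformly over the remaining classes, and it treats labelers as conditionally independent given the true class.

```python
import math

def naive_bayes_label_posterior(labels, accuracies, num_classes, prior=None):
    """Posterior over the true class given labels from independent labelers.

    labels:      class index assigned by each labeler
    accuracies:  historical accuracy of each labeler (in (0, 1))
    num_classes: number of possible classes
    prior:       optional class prior (defaults to uniform)

    Illustrative assumption: a wrong labeler picks uniformly among the
    other num_classes - 1 classes.
    """
    if prior is None:
        prior = [1.0 / num_classes] * num_classes
    log_post = [math.log(p) for p in prior]
    for lab, acc in zip(labels, accuracies):
        for y in range(num_classes):
            if lab == y:
                log_post[y] += math.log(acc)
            else:
                log_post[y] += math.log((1.0 - acc) / (num_classes - 1))
    # Normalize in log space for numerical stability.
    m = max(log_post)
    unnorm = [math.exp(lp - m) for lp in log_post]
    z = sum(unnorm)
    return [u / z for u in unnorm]

# Two fairly accurate labelers agreeing on class 0 yields high confidence
# in class 0; the posterior can then gate which imputed labels are trusted.
posterior = naive_bayes_label_posterior([0, 0], [0.9, 0.9], num_classes=2)
```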

Cite

Text

Kumar et al. "Fine-Tuning Vision Classifiers on a Budget." NeurIPS 2024 Workshops: FITML, 2024.

Markdown

[Kumar et al. "Fine-Tuning Vision Classifiers on a Budget." NeurIPS 2024 Workshops: FITML, 2024.](https://mlanthology.org/neuripsw/2024/kumar2024neuripsw-finetuning/)

BibTeX

@inproceedings{kumar2024neuripsw-finetuning,
  title     = {{Fine-Tuning Vision Classifiers on a Budget}},
  author    = {Kumar, Sunil and Sandler, Ted and Varshavskaya, Paulina},
  booktitle = {NeurIPS 2024 Workshops: FITML},
  year      = {2024},
  url       = {https://mlanthology.org/neuripsw/2024/kumar2024neuripsw-finetuning/}
}