LabelBench: A Comprehensive Framework for Benchmarking Adaptive Label-Efficient Learning
Abstract
Labeled data are critical to modern machine learning applications, but obtaining labels can be expensive. To mitigate this cost, machine learning methods, such as transfer learning, semi-supervised learning and active learning, aim to be label-efficient: achieving high predictive performance from relatively few labeled examples. While obtaining the best label-efficiency in practice often requires combinations of these techniques, existing benchmark and evaluation frameworks do not capture a concerted combination of all such techniques. This paper addresses this deficiency by introducing LabelBench, a new computationally-efficient framework for joint evaluation of multiple label-efficient learning techniques. As an application of LabelBench, we introduce a novel benchmark of state-of-the-art active learning methods in combination with semi-supervised learning for fine-tuning pretrained vision transformers. Our benchmark demonstrates significantly better label-efficiencies than previously reported in active learning. LabelBench’s modular codebase is open-sourced for the broader community to contribute label-efficient learning methods and benchmarks. The repository can be found at: https://github.com/EfficientTraining/LabelBench.
Cite
Text
Zhang et al. "LabelBench: A Comprehensive Framework for Benchmarking Adaptive Label-Efficient Learning." Data-centric Machine Learning Research, 2024.
Markdown
[Zhang et al. "LabelBench: A Comprehensive Framework for Benchmarking Adaptive Label-Efficient Learning." Data-centric Machine Learning Research, 2024.](https://mlanthology.org/dmlr/2024/zhang2024dmlr-labelbench/)
BibTeX
@article{zhang2024dmlr-labelbench,
title = {{LabelBench: A Comprehensive Framework for Benchmarking Adaptive Label-Efficient Learning}},
author = {Zhang, Jifan and Chen, Yifang and Canal, Gregory and Das, Arnav Mohanty and Bhatt, Gantavya and Mussmann, Stephen and Zhu, Yinglun and Bilmes, Jeff and Du, Simon Shaolei and Jamieson, Kevin and Nowak, Robert D},
journal = {Data-centric Machine Learning Research},
year = {2024},
pages = {1--43},
volume = {1},
url = {https://mlanthology.org/dmlr/2024/zhang2024dmlr-labelbench/}
}