A Comparative Survey: Benchmarking for Pool-Based Active Learning

Abstract

Active learning (AL) is a subfield of machine learning (ML) in which a learning algorithm aims to achieve good accuracy with fewer training samples by interactively querying an oracle to label new data points. Pool-based AL is well motivated in many ML tasks where unlabeled data is abundant but labels are hard or costly to obtain. Although many pool-based AL methods have been developed, some important questions remain unanswered, such as how to: 1) determine the current state-of-the-art technique; 2) evaluate the relative benefit of new methods for various properties of the dataset; 3) understand what specific problems merit greater attention; and 4) measure the progress of the field over time. In this paper, we survey and compare various AL strategies used in both recently proposed and classic highly cited methods. We propose to benchmark pool-based AL methods with a variety of datasets and quantitative metrics, and draw insights from the comparative empirical results.

Cite

Text

Zhan et al. "A Comparative Survey: Benchmarking for Pool-Based Active Learning." International Joint Conference on Artificial Intelligence, 2021. doi:10.24963/IJCAI.2021/634

Markdown

[Zhan et al. "A Comparative Survey: Benchmarking for Pool-Based Active Learning." International Joint Conference on Artificial Intelligence, 2021.](https://mlanthology.org/ijcai/2021/zhan2021ijcai-comparative/) doi:10.24963/IJCAI.2021/634

BibTeX

@inproceedings{zhan2021ijcai-comparative,
  title     = {{A Comparative Survey: Benchmarking for Pool-Based Active Learning}},
  author    = {Zhan, Xueying and Liu, Huan and Li, Qing and Chan, Antoni B.},
  booktitle = {International Joint Conference on Artificial Intelligence},
  year      = {2021},
  pages     = {4679--4686},
  doi       = {10.24963/IJCAI.2021/634},
  url       = {https://mlanthology.org/ijcai/2021/zhan2021ijcai-comparative/}
}