A Competitive Algorithm for Agnostic Active Learning

Abstract

For some hypothesis classes and input distributions, \emph{active} agnostic learning needs exponentially fewer samples than passive learning; for other classes and distributions, it offers little to no improvement. The most popular algorithms for agnostic active learning express their performance in terms of a parameter called the disagreement coefficient, but it is known that these algorithms are inefficient on some inputs. We take a different approach to agnostic active learning, getting an algorithm that is \emph{competitive} with the optimal algorithm for any binary hypothesis class $H$ and distribution $\mathcal{D}_X$ over $X$. In particular, if any algorithm can use $m^*$ queries to get $O(\eta)$ error, then our algorithm uses $O(m^* \log |H|)$ queries to get $O(\eta)$ error. Our algorithm lies in the vein of the splitting-based approach of Dasgupta [2004], which gets a similar result for the realizable ($\eta = 0$) setting. We also show that it is NP-hard to do better than our algorithm's $O(\log |H|)$ overhead in general.
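For intuition about the splitting-based approach the abstract references, below is a minimal illustrative sketch (in Python) of greedy version-space splitting in the realizable ($\eta = 0$) setting that Dasgupta [2004] analyzes; the paper's agnostic algorithm is considerably more involved, and the function name and hypothesis-matrix encoding here are assumptions for illustration, not the authors' implementation.

import numpy as np

def splitting_active_learner(H, query, max_queries):
    """Greedy splitting for realizable active learning (illustrative only).

    H: 0/1 array of shape (num_hypotheses, num_points); H[h, x] is
       hypothesis h's label on point x.
    query: oracle mapping a point index x to its true label.
    """
    version_space = np.arange(H.shape[0])  # hypotheses still consistent
    for _ in range(max_queries):
        if len(version_space) <= 1:
            break
        # Query the point whose label most evenly splits the version space,
        # so that either answer eliminates a large fraction of hypotheses.
        ones = H[version_space].sum(axis=0)
        balance = np.minimum(ones, len(version_space) - ones)
        x = int(np.argmax(balance))
        y = query(x)
        version_space = version_space[H[version_space, x] == y]
    return version_space

# Thresholds on {0, ..., 9}: h_t(x) = 1 iff x >= t, so |H| = 11.
n = 10
H = np.array([[int(x >= t) for x in range(n)] for t in range(n + 1)])
true_t = 7
survivors = splitting_active_learner(H, lambda x: int(x >= true_t), max_queries=8)
print(survivors)  # only the true threshold t = 7 survives the queried labels

On a threshold class like this, the splitting loop reduces to binary search, recovering the target with roughly $\log |H|$ queries where passive learning needs on the order of $|H|$ samples; this is the kind of exponential gap between active and passive learning that the abstract describes.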

Cite

Text

Zhou and Price. "A Competitive Algorithm for Agnostic Active Learning." Neural Information Processing Systems, 2023.

Markdown

[Zhou and Price. "A Competitive Algorithm for Agnostic Active Learning." Neural Information Processing Systems, 2023.](https://mlanthology.org/neurips/2023/zhou2023neurips-competitive/)

BibTeX

@inproceedings{zhou2023neurips-competitive,
  title     = {{A Competitive Algorithm for Agnostic Active Learning}},
  author    = {Zhou, Yihan and Price, Eric},
  booktitle = {Neural Information Processing Systems},
  year      = {2023},
  url       = {https://mlanthology.org/neurips/2023/zhou2023neurips-competitive/}
}