Corruption Robust Active Learning
Abstract
We conduct theoretical studies on streaming-based active learning for binary classification under unknown adversarial label corruptions. In this setting, before the learner observes each sample, the adversary decides whether or not to corrupt its label. First, we show that, in a benign corruption setting (which includes the misspecification setting as a special case), with a slight enlargement of the hypothesis elimination threshold, the classical RobustCAL framework can (surprisingly) achieve nearly the same label complexity guarantee as in the non-corrupted setting. However, this algorithm can fail in the general corruption setting. To resolve this drawback, we propose a new algorithm which is provably correct without any assumptions on the presence of corruptions. Furthermore, this algorithm enjoys the minimax label complexity in the non-corrupted setting (which is achieved by RobustCAL) and only requires $\tilde{\mathcal{O}}(C_{\mathrm{total}})$ additional labels in the corrupted setting to achieve an accuracy of $\mathcal{O}(\varepsilon + \frac{C_{\mathrm{total}}}{n})$, where $\varepsilon$ is the target accuracy, $C_{\mathrm{total}}$ is the total number of corruptions and $n$ is the total number of unlabeled samples.
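To make the setting concrete, here is a minimal Python sketch of a streaming, disagreement-based learner with an enlarged elimination threshold. It is not the paper's algorithm and not RobustCAL itself: it assumes a toy class of 1-D threshold classifiers on a finite grid, noiseless labels apart from the corruptions, and a plain mistake-count elimination rule. The names `run_stream`, `slack`, and `corrupt_rounds` are hypothetical and chosen only for illustration.

```python
import random


def run_stream(n=2000, true_t=0.5, corrupt_rounds=frozenset(), slack=0,
               grid=101, seed=0):
    """Toy CAL-style learner over threshold classifiers h_t(x) = [x >= t].

    corrupt_rounds: round indices fixed by the adversary in advance; the
        label of that round is flipped if the learner queries it.
    slack: elimination threshold; a hypothesis is dropped once it has made
        more than `slack` extra mistakes compared with the current best one.
    Returns (sorted surviving thresholds, number of label queries).
    """
    rng = random.Random(seed)
    thresholds = [i / (grid - 1) for i in range(grid)]
    # Mistake counts on queried samples; the keys are the surviving hypotheses.
    mistakes = {t: 0 for t in thresholds}
    queries = 0
    for round_idx in range(n):
        x = rng.random()
        preds = {t: int(x >= t) for t in mistakes}
        if len(set(preds.values())) == 1:
            continue  # all survivors agree on x, so no label is requested
        y = int(x >= true_t)
        if round_idx in corrupt_rounds:
            y = 1 - y  # adversarial label flip
        queries += 1
        for t, p in preds.items():
            if p != y:
                mistakes[t] += 1
        best = min(mistakes.values())
        mistakes = {t: m for t, m in mistakes.items() if m - best <= slack}
    return sorted(mistakes), queries


if __name__ == "__main__":
    corrupted = frozenset(range(20))  # adversary flips the first 20 rounds
    for slack in (0, 20):
        survivors, queries = run_stream(corrupt_rounds=corrupted, slack=slack)
        print(slack, 0.5 in survivors, len(survivors), queries)
```

In this toy model, the true threshold makes a mistake only on corrupted queries, so a slack at least as large as the number of corruptions keeps it from ever being eliminated, while a slack of zero lets a single early corruption remove it. This only mirrors the intuition behind enlarging the elimination threshold; the guarantees stated in the abstract rely on the paper's own analysis.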
Cite
Text
Chen et al. "Corruption Robust Active Learning." Neural Information Processing Systems, 2021.
Markdown
[Chen et al. "Corruption Robust Active Learning." Neural Information Processing Systems, 2021.](https://mlanthology.org/neurips/2021/chen2021neurips-corruption/)
BibTeX
@inproceedings{chen2021neurips-corruption,
title = {{Corruption Robust Active Learning}},
author = {Chen, Yifang and Du, Simon S. and Jamieson, Kevin G.},
booktitle = {Neural Information Processing Systems},
year = {2021},
url = {https://mlanthology.org/neurips/2021/chen2021neurips-corruption/}
}