Efficient Active Learning of Sparse Halfspaces

Abstract

We study the problem of efficient PAC active learning of homogeneous linear classifiers (halfspaces) in $\mathbb{R}^d$, where the goal is to learn a halfspace with low error using as few label queries as possible. Under the additional assumption that there is a $t$-sparse halfspace that performs well on the data ($t \ll d$), we would like our active learning algorithm to be {\em attribute efficient}, i.e., to have label requirements sublinear in $d$. In this paper, we provide a computationally efficient algorithm that achieves this goal. Under certain distributional assumptions on the data, our algorithm achieves a label complexity of $O(t \cdot \mathrm{polylog}(d, \frac{1}{\epsilon}))$. In contrast, existing algorithms in this setting are either computationally inefficient, or subject to label requirements polynomial in $d$ or $\frac{1}{\epsilon}$.

Cite

Text

Zhang. "Efficient Active Learning of Sparse Halfspaces." Annual Conference on Computational Learning Theory, 2018.

Markdown

[Zhang. "Efficient Active Learning of Sparse Halfspaces." Annual Conference on Computational Learning Theory, 2018.](https://mlanthology.org/colt/2018/zhang2018colt-efficient/)

BibTeX

@inproceedings{zhang2018colt-efficient,
  title     = {{Efficient Active Learning of Sparse Halfspaces}},
  author    = {Zhang, Chicheng},
  booktitle = {Annual Conference on Computational Learning Theory},
  year      = {2018},
  pages     = {1856--1880},
  url       = {https://mlanthology.org/colt/2018/zhang2018colt-efficient/}
}