Efficient Active Learning Halfspaces with Tsybakov Noise: A Non-Convex Optimization Approach

Abstract

We study the problem of computationally and label efficient PAC active learning $d$-dimensional halfspaces with Tsybakov Noise (Tsybakov, 2004) under structured unlabeled data distributions. Inspired by Diakonikolas et al., (2020c), we prove that any approximate first-order stationary point of a smooth nonconvex loss function yields a halfspace with a low excess error guarantee. In light of the above structural result, we design a nonconvex optimization-based algorithm with a label complexity of $\tilde{O}(d (\frac{1}{\epsilon})^{\frac{8-6\alpha}{3\alpha-1}})$, under the assumption that the Tsybakov noise parameter $\alpha \in (\frac13, 1]$, which narrows down the gap between the label complexities of the previously known efficient passive or active algorithms (Diakonikolas et al., 2020b; Zhang and Li, 2021) and the information-theoretic lower bound in this setting.

Cite

Text

Li and Zhang. "Efficient Active Learning Halfspaces with Tsybakov Noise: A Non-Convex Optimization Approach." Artificial Intelligence and Statistics, 2024.

Markdown

[Li and Zhang. "Efficient Active Learning Halfspaces with Tsybakov Noise: A Non-Convex Optimization Approach." Artificial Intelligence and Statistics, 2024.](https://mlanthology.org/aistats/2024/li2024aistats-efficient/)

BibTeX

@inproceedings{li2024aistats-efficient,
  title     = {{Efficient Active Learning Halfspaces with Tsybakov Noise: A Non-Convex Optimization Approach}},
  author    = {Li, Yinan and Zhang, Chicheng},
  booktitle = {Artificial Intelligence and Statistics},
  year      = {2024},
  pages     = {4744-4752},
  volume    = {238},
  url       = {https://mlanthology.org/aistats/2024/li2024aistats-efficient/}
}