Efficient Active Learning Halfspaces with Tsybakov Noise: A Non-Convex Optimization Approach
Abstract
We study the problem of computationally and label efficient PAC active learning $d$-dimensional halfspaces with Tsybakov Noise (Tsybakov, 2004) under structured unlabeled data distributions. Inspired by Diakonikolas et al., (2020c), we prove that any approximate first-order stationary point of a smooth nonconvex loss function yields a halfspace with a low excess error guarantee. In light of the above structural result, we design a nonconvex optimization-based algorithm with a label complexity of $\tilde{O}(d (\frac{1}{\epsilon})^{\frac{8-6\alpha}{3\alpha-1}})$, under the assumption that the Tsybakov noise parameter $\alpha \in (\frac13, 1]$, which narrows down the gap between the label complexities of the previously known efficient passive or active algorithms (Diakonikolas et al., 2020b; Zhang and Li, 2021) and the information-theoretic lower bound in this setting.
Cite
Text
Li and Zhang. "Efficient Active Learning Halfspaces with Tsybakov Noise: A Non-Convex Optimization Approach." Artificial Intelligence and Statistics, 2024.Markdown
[Li and Zhang. "Efficient Active Learning Halfspaces with Tsybakov Noise: A Non-Convex Optimization Approach." Artificial Intelligence and Statistics, 2024.](https://mlanthology.org/aistats/2024/li2024aistats-efficient/)BibTeX
@inproceedings{li2024aistats-efficient,
title = {{Efficient Active Learning Halfspaces with Tsybakov Noise: A Non-Convex Optimization Approach}},
author = {Li, Yinan and Zhang, Chicheng},
booktitle = {Artificial Intelligence and Statistics},
year = {2024},
pages = {4744-4752},
volume = {238},
url = {https://mlanthology.org/aistats/2024/li2024aistats-efficient/}
}