Attribute-Efficient Learning of Halfspaces with Malicious Noise: Near-Optimal Label Complexity and Noise Tolerance

Abstract

This paper is concerned with computationally efficient learning of homogeneous sparse halfspaces in $\mathbb{R}^d$ under noise. Though recent works have established attribute-efficient learning algorithms under various types of label noise (e.g. bounded noise), it remains an open question of when and how $s$-sparse halfspaces can be efficiently learned under the challenging {\em malicious noise} model, where an adversary may corrupt both the unlabeled examples and the labels. We answer this question in the affirmative by designing a computationally efficient active learning algorithm with near-optimal label complexity of $\tilde{O}(s \log^4\frac{d}{\epsilon})$ and noise tolerance $\eta = \Omega(\epsilon)$, where $\epsilon \in (0, 1)$ is the target error rate, under the assumption that the distribution over (uncorrupted) unlabeled examples is isotropic log-concave. Our algorithm can be straightforwardly tailored to the passive learning setting, and we show that the sample complexity is $\tilde{O}(\frac{1}{\epsilon}s^2 \log^5 d)$ which also enjoys attribute efficiency. Our main techniques include attribute-efficient paradigms for soft outlier removal and for empirical risk minimization, and a new analysis of uniform concentration for unbounded instances – all of them crucially take the sparsity structure of the underlying halfspace into account.

Cite

Text

Shen and Zhang. "Attribute-Efficient Learning of Halfspaces with Malicious Noise: Near-Optimal Label Complexity and Noise Tolerance." Proceedings of the 32nd International Conference on Algorithmic Learning Theory, 2021.

Markdown

[Shen and Zhang. "Attribute-Efficient Learning of Halfspaces with Malicious Noise: Near-Optimal Label Complexity and Noise Tolerance." Proceedings of the 32nd International Conference on Algorithmic Learning Theory, 2021.](https://mlanthology.org/alt/2021/shen2021alt-attributeefficient/)

BibTeX

@inproceedings{shen2021alt-attributeefficient,
  title     = {{Attribute-Efficient Learning of Halfspaces with Malicious Noise: Near-Optimal Label Complexity and Noise Tolerance}},
  author    = {Shen, Jie and Zhang, Chicheng},
  booktitle = {Proceedings of the 32nd International Conference on Algorithmic Learning Theory},
  year      = {2021},
  pages     = {1072-1113},
  volume    = {132},
  url       = {https://mlanthology.org/alt/2021/shen2021alt-attributeefficient/}
}