Tight Bounds for Maximum $\ell_1$-Margin Classifiers

Abstract

Popular iterative algorithms such as boosting methods and coordinate descent on linear models converge to the maximum $\ell_1$-margin classifier, a.k.a. sparse hard-margin SVM, in high-dimensional regimes where the data is linearly separable. Previous works consistently show that many estimators relying on the $\ell_1$-norm achieve improved statistical rates for hard sparse ground truths. We show that, surprisingly, this adaptivity does not apply to the maximum $\ell_1$-margin classifier for a standard discriminative setting. In particular, for the noiseless setting, we prove tight upper and lower bounds for the prediction error that match existing rates of order $\frac{\|w^*\|_1^{2/3}}{n^{1/3}}$ for general ground truths. To complete the picture, we show that when interpolating noisy observations, the error vanishes at a rate of order $\frac{1}{\sqrt{\log(d/n)}}$. We are therefore the first to show benign overfitting for the maximum $\ell_1$-margin classifier.
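For reference, the maximum $\ell_1$-margin classifier (sparse hard-margin SVM) discussed in the abstract is the standard estimator

$$\hat{w} \in \arg\max_{\|w\|_1 \le 1} \; \min_{i \in [n]} y_i \langle x_i, w \rangle,$$

or equivalently $\arg\min_w \|w\|_1$ subject to $y_i \langle x_i, w \rangle \ge 1$ for all $i$. The notation $(x_i, y_i)_{i=1}^n$ for the training data and $d$ for the ambient dimension is assumed here and does not appear in the abstract itself.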

Cite

Text

Stojanovic et al. "Tight Bounds for Maximum $\ell_1$-Margin Classifiers." Proceedings of The 35th International Conference on Algorithmic Learning Theory, 2024.

Markdown

[Stojanovic et al. "Tight Bounds for Maximum $\ell_1$-Margin Classifiers." Proceedings of The 35th International Conference on Algorithmic Learning Theory, 2024.](https://mlanthology.org/alt/2024/stojanovic2024alt-tight/)

BibTeX

@inproceedings{stojanovic2024alt-tight,
  title     = {{Tight Bounds for Maximum $\ell_1$-Margin Classifiers}},
  author    = {Stojanovic, Stefan and Donhauser, Konstantin and Yang, Fanny},
  booktitle = {Proceedings of The 35th International Conference on Algorithmic Learning Theory},
  year      = {2024},
  pages     = {1055--1112},
  volume    = {237},
  url       = {https://mlanthology.org/alt/2024/stojanovic2024alt-tight/}
}