Sparsity vs. Large Margins for Linear Classifiers

Abstract

We provide small sample size bounds on the generalisation error of linear classifiers that take advantage of large observed margins on the training set and of sparsity in the data-dependent expansion coefficients. It is already known from results in the luckiness framework that both criteria independently have a large impact on the generalisation error. Our new results show that they can be combined, which theoretically justifies learning algorithms such as the Support Vector Machine [4] or the Relevance Vector Machine [12]. In contrast to previous studies, we avoid the classical technique of symmetrisation by a ghost sample and instead use the sparsity directly to estimate the generalisation error. We demonstrate that our result leads to practically useful bounds even in the case of small sample sizes, provided the training set witnesses our prior belief in sparsity and large margins.

1 Introduction

In this paper we present a bound on the generalisation error of linear classifiers that takes ad...
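The two quantities the abstract combines, the observed margin and the sparsity of the data-dependent expansion, can be made concrete with a minimal sketch. The following is an illustration only, not the paper's algorithm: it trains a dual-form perceptron on a tiny hypothetical 2D dataset, then reads off the normalised margin and the number of training points with nonzero expansion coefficients (the "sparsity" of the solution).

```python
import math

# Tiny linearly separable toy set (hypothetical data, for illustration only).
X = [(2.0, 1.0), (1.5, 2.0), (-1.0, -1.5), (-2.0, -0.5)]
y = [1, 1, -1, -1]

def dual_perceptron(X, y, epochs=100):
    """Perceptron in dual form: alpha[i] counts the mistakes on example i."""
    alpha = [0] * len(X)
    for _ in range(epochs):
        mistakes = 0
        for i, (xi, yi) in enumerate(zip(X, y)):
            s = sum(alpha[j] * y[j] * (X[j][0] * xi[0] + X[j][1] * xi[1])
                    for j in range(len(X)))
            if yi * s <= 0:          # mistake (or undecided): update example i
                alpha[i] += 1
                mistakes += 1
        if mistakes == 0:            # converged: all points correctly classified
            break
    return alpha

alpha = dual_perceptron(X, y)
# Primal weight vector recovered from the expansion: w = sum_i alpha_i y_i x_i
w = [sum(a * yi * xi[d] for a, yi, xi in zip(alpha, y, X)) for d in (0, 1)]
norm_w = math.hypot(w[0], w[1])
# Observed (normalised) margin: min_i y_i <w, x_i> / ||w||
margin = min(yi * (w[0] * xi[0] + w[1] * xi[1]) / norm_w
             for xi, yi in zip(X, y))
# Sparsity: how many training points actually enter the expansion
support = sum(1 for a in alpha if a > 0)
print(f"margin = {margin:.3f}, nonzero coefficients = {support}/{len(X)}")
```

On this toy set the perceptron converges after touching a single example, so the expansion is maximally sparse while the margin is strictly positive; the paper's bound is precisely about how both quantities, observed on the training set, jointly tighten the generalisation guarantee.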

Cite

Text

Herbrich et al. "Sparsity vs. Large Margins for Linear Classifiers." Annual Conference on Computational Learning Theory, 2000.

Markdown

[Herbrich et al. "Sparsity vs. Large Margins for Linear Classifiers." Annual Conference on Computational Learning Theory, 2000.](https://mlanthology.org/colt/2000/herbrich2000colt-sparsity/)

BibTeX

@inproceedings{herbrich2000colt-sparsity,
  title     = {{Sparsity vs. Large Margins for Linear Classifiers}},
  author    = {Herbrich, Ralf and Graepel, Thore and Shawe-Taylor, John},
  booktitle = {Annual Conference on Computational Learning Theory},
  year      = {2000},
  pages     = {304--308},
  url       = {https://mlanthology.org/colt/2000/herbrich2000colt-sparsity/}
}