A PAC-Bayesian Margin Bound for Linear Classifiers: Why SVMs Work
Abstract
We present a bound on the generalisation error of linear classifiers in terms of a refined margin quantity on the training set. The result is obtained in a PAC-Bayesian framework and is based on geometrical arguments in the space of linear classifiers. The new bound constitutes an exponential improvement over the previously tightest margin bound of Shawe-Taylor et al. [8] and scales logarithmically in the inverse margin. Even with fewer training examples than input dimensions, sufficiently large margins lead to non-trivial bound values and, for maximum margins, to a vanishing complexity term. Furthermore, the classical margin is too coarse a measure of the essential quantity that controls the generalisation error: the volume ratio between the whole hypothesis space and the subset of consistent hypotheses. The practical relevance of the result lies in the fact that the well-known support vector machine is optimal w.r.t. the new bound only if the feature vectors are all of the same length. As a consequence we recommend using SVMs on normalised feature vectors only, a recommendation that is well supported by our numerical experiments on two benchmark data sets.
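As a small illustration of the closing recommendation (a sketch assuming scikit-learn, not code from the paper; the data set and parameters are purely illustrative), the snippet below rescales every feature vector to unit Euclidean length before training a linear SVM, so that all training examples have the same length:

# Sketch of the normalisation recommendation (assumes scikit-learn;
# data and parameters are illustrative, not taken from the paper).
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import Normalizer  # rescales each sample to unit L2 norm
from sklearn.svm import SVC

X, y = make_classification(n_samples=400, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Linear SVM trained on feature vectors of equal (unit) length.
clf = make_pipeline(Normalizer(norm="l2"), SVC(kernel="linear", C=1.0))
clf.fit(X_train, y_train)
print("test accuracy with normalised features:", clf.score(X_test, y_test))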
Cite
Text
Herbrich and Graepel. "A PAC-Bayesian Margin Bound for Linear Classifiers: Why SVMs Work." Neural Information Processing Systems, 2000.
Markdown
[Herbrich and Graepel. "A PAC-Bayesian Margin Bound for Linear Classifiers: Why SVMs Work." Neural Information Processing Systems, 2000.](https://mlanthology.org/neurips/2000/herbrich2000neurips-pacbayesian/)
BibTeX
@inproceedings{herbrich2000neurips-pacbayesian,
  title     = {{A PAC-Bayesian Margin Bound for Linear Classifiers: Why SVMs Work}},
  author    = {Herbrich, Ralf and Graepel, Thore},
  booktitle = {Neural Information Processing Systems},
  year      = {2000},
  pages     = {224-230},
  url       = {https://mlanthology.org/neurips/2000/herbrich2000neurips-pacbayesian/}
}