Data-Dependent Margin-Based Generalization Bounds for Classification
Abstract
We derive new margin-based inequalities for the probability of error of classifiers. The main feature of these bounds is that they can be calculated using the training data and therefore may be effectively used for model selection purposes. In particular, the bounds involve quantities such as the empirical fat-shattering dimension and covering number measured on the training data, as opposed to their worst-case counterparts traditionally used in such analyses, and appear to be sharper and more general than recent results involving empirical complexity measures. In addition, we also develop an alternative data-based bound for the generalization error of classes of convex combinations of classifiers involving an empirical complexity measure that is more easily computable than the empirical covering number or fat-shattering dimension. We also show an example of efficient computation of the new bounds.
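The quantities the abstract refers to can be illustrated concretely. The sketch below (a toy illustration, not the paper's algorithm: the data, the decision-stump base class, and the greedy covering heuristic are all assumptions) computes the empirical margins y·f(x) of a convex combination of base classifiers on a sample, the fraction of points with margin below a threshold γ, and a greedy upper estimate of the empirical L1 covering number of the base class restricted to the training points.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy 1-D sample with labels in {-1, +1} (illustrative data only).
n = 200
X = rng.uniform(-1.0, 1.0, size=n)
y = np.sign(X + 0.1 * rng.normal(size=n))
y[y == 0] = 1

# Base classifiers: decision stumps h_t(x) = sign(x - theta_t).
thresholds = np.linspace(-1.0, 1.0, 21)
H = np.sign(X[None, :] - thresholds[:, None])  # shape (T, n)
H[H == 0] = 1

# Convex combination f = sum_t w_t h_t, with weights w on the simplex;
# here uniform weights for simplicity, so f(x_i) lies in [-1, 1].
w = np.ones(len(thresholds)) / len(thresholds)
f = w @ H

# Empirical margins y_i * f(x_i); margin-based bounds penalize the
# fraction of training points whose margin falls below a scale gamma.
margins = y * f
gamma = 0.1
empirical_margin_error = float(np.mean(margins < gamma))

def greedy_cover_size(V, eps):
    """Greedy upper bound on the size of an eps-cover of the rows of V
    under the normalized L1 distance, computed on the sample itself."""
    remaining = list(range(V.shape[0]))
    centers = 0
    while remaining:
        c = remaining[0]
        d = np.mean(np.abs(V[remaining] - V[c]), axis=1)
        remaining = [remaining[i] for i in np.flatnonzero(d > eps)]
        centers += 1
    return centers

# Data-dependent complexity: cover the base class restricted to the sample.
N_eps = greedy_cover_size(H, eps=0.2)
print(empirical_margin_error, N_eps)
```

The point of the example is that both quantities are computed from the training sample alone: no worst-case supremum over the input space is needed, which is what makes such bounds usable for model selection.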
Cite
Text
Kégl et al. "Data-Dependent Margin-Based Generalization Bounds for Classification." Annual Conference on Computational Learning Theory, 2001. doi:10.1007/3-540-44581-1_24
Markdown
[Kégl et al. "Data-Dependent Margin-Based Generalization Bounds for Classification." Annual Conference on Computational Learning Theory, 2001.](https://mlanthology.org/colt/2001/kegl2001colt-data/) doi:10.1007/3-540-44581-1_24
BibTeX
@inproceedings{kegl2001colt-data,
title = {{Data-Dependent Margin-Based Generalization Bounds for Classification}},
author = {Kégl, Balázs and Linder, Tamás and Lugosi, Gábor},
booktitle = {Annual Conference on Computational Learning Theory},
year = {2001},
pages = {368-384},
doi = {10.1007/3-540-44581-1_24},
url = {https://mlanthology.org/colt/2001/kegl2001colt-data/}
}