Sparseness Versus Estimating Conditional Probabilities: Some Asymptotic Results
Abstract
One of the nice properties of kernel classifiers such as SVMs is that they often produce sparse solutions. However, the decision functions of these classifiers cannot always be used to estimate the conditional probability of the class label. We investigate the relationship between these two properties and show that these are intimately related: sparseness does not occur when the conditional probabilities can be unambiguously estimated. We consider a family of convex loss functions and derive sharp asymptotic bounds for the number of support vectors. This enables us to characterize the exact trade-off between sparseness and the ability to estimate conditional probabilities for these loss functions.
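The trade-off the abstract describes can be illustrated numerically (this sketch is not from the paper itself). For a margin loss φ, the pointwise conditional risk is C(η, f) = η·φ(f) + (1−η)·φ(−f), where η = P(Y = 1 | X = x). For the hinge loss φ(t) = max(0, 1−t), the minimizer is sign(2η−1), which reveals only which class is more likely; for the logistic loss φ(t) = log(1+e^{−t}), the minimizer is log(η/(1−η)), which can be inverted to recover η exactly:

```python
import math

def conditional_risk(eta, phi, f):
    # Pointwise conditional risk C(eta, f) = eta*phi(f) + (1-eta)*phi(-f)
    return eta * phi(f) + (1.0 - eta) * phi(-f)

def argmin_on_grid(eta, phi, lo=-6.0, hi=6.0, steps=120001):
    # Brute-force grid search for the minimizer of C(eta, .)
    best_f, best_r = lo, float("inf")
    for i in range(steps):
        f = lo + (hi - lo) * i / (steps - 1)
        r = conditional_risk(eta, phi, f)
        if r < best_r:
            best_f, best_r = f, r
    return best_f

hinge = lambda t: max(0.0, 1.0 - t)          # SVM loss: sparse, not invertible
logistic = lambda t: math.log1p(math.exp(-t))  # logistic loss: invertible link

for eta in (0.6, 0.8, 0.95):
    f_hinge = argmin_on_grid(eta, hinge)
    f_log = argmin_on_grid(eta, logistic)
    eta_hat = 1.0 / (1.0 + math.exp(-f_log))  # invert the logistic link
    print(f"eta={eta}: hinge minimizer {f_hinge:+.2f}, "
          f"logistic recovers eta ~ {eta_hat:.3f}")
```

For every η > 1/2 the hinge minimizer is the same value +1, so the decision function carries no information about η beyond its sign; the logistic minimizer varies smoothly with η and determines it unambiguously, at the cost of sparseness.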
Cite
Text
Bartlett and Tewari. "Sparseness Versus Estimating Conditional Probabilities: Some Asymptotic Results." Annual Conference on Computational Learning Theory, 2004. doi:10.1007/978-3-540-27819-1_39
Markdown
[Bartlett and Tewari. "Sparseness Versus Estimating Conditional Probabilities: Some Asymptotic Results." Annual Conference on Computational Learning Theory, 2004.](https://mlanthology.org/colt/2004/bartlett2004colt-sparseness/) doi:10.1007/978-3-540-27819-1_39
BibTeX
@inproceedings{bartlett2004colt-sparseness,
title = {{Sparseness Versus Estimating Conditional Probabilities: Some Asymptotic Results}},
author = {Bartlett, Peter L. and Tewari, Ambuj},
booktitle = {Annual Conference on Computational Learning Theory},
year = {2004},
pages = {564-578},
doi = {10.1007/978-3-540-27819-1_39},
url = {https://mlanthology.org/colt/2004/bartlett2004colt-sparseness/}
}