Bayesian Methods for Support Vector Machines: Evidence and Predictive Class Probabilities

Abstract

I describe a framework for interpreting Support Vector Machines (SVMs) as maximum a posteriori (MAP) solutions to inference problems with Gaussian Process priors. This probabilistic interpretation can provide intuitive guidelines for choosing a ‘good’ SVM kernel. Beyond this, it allows Bayesian methods to be used for tackling two of the outstanding challenges in SVM classification: how to tune hyperparameters—the misclassification penalty C, and any parameters specifying the kernel—and how to obtain predictive class probabilities rather than the conventional deterministic class label predictions. Hyperparameters can be set by maximizing the evidence; I explain how the latter can be defined and properly normalized. Both analytical approximations and numerical methods (Monte Carlo chaining) for estimating the evidence are discussed. I also compare different methods of estimating class probabilities, ranging from simple evaluation at the MAP or at the posterior average to full averaging over the posterior. A simple toy application illustrates the various concepts and techniques.
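The contrast between the plug-in (MAP) estimate and full posterior averaging of class probabilities, mentioned at the end of the abstract, can be sketched numerically. The following is a minimal illustration, not the paper's method: it assumes a hypothetical Gaussian posterior over the latent function value at a test point (with made-up mode and spread) and a sigmoid link, then compares evaluating the link at the MAP against averaging it over posterior samples.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Hypothetical posterior over the latent function value at a test point:
# for illustration, a Gaussian whose mode plays the role of the MAP value.
theta_map = 0.8        # assumed MAP latent value (illustrative)
posterior_std = 1.5    # assumed posterior spread (illustrative)

samples = rng.normal(theta_map, posterior_std, size=100_000)

# (a) plug-in estimate: evaluate the link function at the MAP
p_map = sigmoid(theta_map)

# (b) fully Bayesian estimate: average the link function over posterior samples
p_avg = sigmoid(samples).mean()

print(f"P(class=+1), MAP plug-in:        {p_map:.3f}")
print(f"P(class=+1), posterior average:  {p_avg:.3f}")
```

Because the sigmoid is nonlinear, the two estimates differ: averaging over a broad posterior pulls the predicted probability towards 0.5, so the fully Bayesian estimate reflects uncertainty about the latent function that the plug-in estimate ignores.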

Cite

Text

Sollich. "Bayesian Methods for Support Vector Machines: Evidence and Predictive Class Probabilities." Machine Learning, 2002. doi:10.1023/A:1012489924661

Markdown

[Sollich. "Bayesian Methods for Support Vector Machines: Evidence and Predictive Class Probabilities." Machine Learning, 2002.](https://mlanthology.org/mlj/2002/sollich2002mlj-bayesian/) doi:10.1023/A:1012489924661

BibTeX

@article{sollich2002mlj-bayesian,
  title     = {{Bayesian Methods for Support Vector Machines: Evidence and Predictive Class Probabilities}},
  author    = {Sollich, Peter},
  journal   = {Machine Learning},
  year      = {2002},
  pages     = {21--52},
  doi       = {10.1023/A:1012489924661},
  volume    = {46},
  url       = {https://mlanthology.org/mlj/2002/sollich2002mlj-bayesian/}
}