PAC-Bayes Risk Bounds for Stochastic Averages and Majority Votes of Sample-Compressed Classifiers

Abstract

We propose a PAC-Bayes theorem for the sample-compression setting where each classifier is described by a compression subset of the training data and a message string of additional information. This setting, which is the appropriate one for describing many learning algorithms, strictly generalizes the usual data-independent setting where classifiers are represented only by data-independent message strings (or parameters taken from a continuous set). The proposed PAC-Bayes theorem for the sample-compression setting reduces to the PAC-Bayes theorem of Seeger (2002) and Langford (2005) when the compression subset of each classifier vanishes. For posteriors having all their weights on a single sample-compressed classifier, the general risk bound reduces to a bound similar to the tight sample-compression bound proposed in Laviolette et al. (2005). Finally, we extend our results to the case where each sample-compressed classifier of a data-dependent ensemble may abstain from predicting a class label.
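As a point of reference, the data-independent PAC-Bayes bound of Seeger (2002) and Langford (2005), to which the general theorem reduces when compression subsets vanish, is usually stated as follows. This is a sketch in standard notation rather than a quotation from the paper: $P$ is a prior and $Q$ a posterior over classifiers, $R(G_Q)$ and $R_S(G_Q)$ are the true and empirical risks of the Gibbs classifier on a sample $S$ of $m$ examples, and $\delta$ is the confidence parameter.

\[
\Pr_{S \sim D^m}\!\left( \forall Q:\;
\mathrm{kl}\bigl(R_S(G_Q) \,\|\, R(G_Q)\bigr)
\le \frac{\mathrm{KL}(Q\|P) + \ln\frac{m+1}{\delta}}{m}
\right) \ge 1 - \delta,
\qquad
\mathrm{kl}(q\|p) = q\ln\frac{q}{p} + (1-q)\ln\frac{1-q}{1-p}.
\]

The paper's contribution is to extend this statement to priors and posteriors over classifiers that are themselves built from compression subsets of the training data.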

Cite

Text

Laviolette and Marchand. "PAC-Bayes Risk Bounds for Stochastic Averages and Majority Votes of Sample-Compressed Classifiers." Journal of Machine Learning Research, 2007.

Markdown

[Laviolette and Marchand. "PAC-Bayes Risk Bounds for Stochastic Averages and Majority Votes of Sample-Compressed Classifiers." Journal of Machine Learning Research, 2007.](https://mlanthology.org/jmlr/2007/laviolette2007jmlr-pacbayes/)

BibTeX

@article{laviolette2007jmlr-pacbayes,
  title     = {{PAC-Bayes Risk Bounds for Stochastic Averages and Majority Votes of Sample-Compressed Classifiers}},
  author    = {Laviolette, François and Marchand, Mario},
  journal   = {Journal of Machine Learning Research},
  year      = {2007},
  pages     = {1461--1487},
  volume    = {8},
  url       = {https://mlanthology.org/jmlr/2007/laviolette2007jmlr-pacbayes/}
}