Estimating Generalization Error Using Out-of-Bag Estimates

Abstract

We provide a method for estimating the generalization error of a bag using out-of-bag estimates. In bagging, each predictor (single hypothesis) is learned from a bootstrap sample of the training examples; the output of a bag (a set of predictors) on an example is determined by voting. The out-of-bag estimate is based on recording the votes of each predictor on those training examples omitted from its bootstrap sample. Because no additional predictors are generated, the out-of-bag estimate requires considerably less time than 10-fold cross-validation. We address the question of how to use the out-of-bag estimate to estimate generalization error. Our experiments on several datasets show that the out-of-bag estimate and 10-fold cross-validation have very inaccurate (much too optimistic) confidence levels. We can improve the out-of-bag estimate by incorporating a correction.
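The out-of-bag procedure described in the abstract can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: it uses a toy two-class dataset and a simple nearest-centroid predictor (both hypothetical stand-ins) to show how votes are tallied only on the examples each predictor never saw.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy two-class dataset (a hypothetical stand-in for the paper's benchmarks).
n = 200
X = np.vstack([rng.normal(0, 1, (n // 2, 2)), rng.normal(2, 1, (n // 2, 2))])
y = np.array([0] * (n // 2) + [1] * (n // 2))

def fit_centroids(Xb, yb):
    """A simple 'predictor': one mean vector per class."""
    return np.stack([Xb[yb == c].mean(axis=0) for c in (0, 1)])

def predict(centroids, X):
    """Assign each example to the class of its nearest centroid."""
    d = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
    return d.argmin(axis=1)

B = 50                    # number of bagged predictors
votes = np.zeros((n, 2))  # out-of-bag vote tally: one row per example
for _ in range(B):
    idx = rng.integers(0, n, n)             # bootstrap sample (with replacement)
    oob = np.setdiff1d(np.arange(n), idx)   # examples omitted from this sample
    model = fit_centroids(X[idx], y[idx])
    votes[oob, predict(model, X[oob])] += 1  # vote only on out-of-bag examples

covered = votes.sum(axis=1) > 0  # examples that received at least one OOB vote
oob_error = np.mean(votes[covered].argmax(axis=1) != y[covered])
print(f"out-of-bag error estimate: {oob_error:.3f}")
```

Each example is out-of-bag for roughly 37% of the predictors, so with enough rounds every example receives votes and the majority of its out-of-bag votes yields an error estimate at no extra training cost, which is the efficiency advantage over 10-fold cross-validation noted above.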

Cite

Text

Bylander and Hanzlik. "Estimating Generalization Error Using Out-of-Bag Estimates." AAAI Conference on Artificial Intelligence, 1999.

Markdown

[Bylander and Hanzlik. "Estimating Generalization Error Using Out-of-Bag Estimates." AAAI Conference on Artificial Intelligence, 1999.](https://mlanthology.org/aaai/1999/bylander1999aaai-estimating/)

BibTeX

@inproceedings{bylander1999aaai-estimating,
  title     = {{Estimating Generalization Error Using Out-of-Bag Estimates}},
  author    = {Bylander, Tom and Hanzlik, Dennis},
  booktitle = {AAAI Conference on Artificial Intelligence},
  year      = {1999},
  pages     = {321--327},
  url       = {https://mlanthology.org/aaai/1999/bylander1999aaai-estimating/}
}