Estimating Generalization Error Using Out-of-Bag Estimates
Abstract
We provide a method for estimating the generalization error of a bag using out-of-bag estimates. In bagging, each predictor (single hypothesis) is learned from a bootstrap sample of the training examples; the output of a bag (a set of predictors) on an example is determined by voting. The out-of-bag estimate is based on recording the votes of each predictor on those training examples omitted from its bootstrap sample. Because no additional predictors are generated, the out-of-bag estimate requires considerably less time than 10-fold cross-validation. We address the question of how to use the out-of-bag estimate to estimate generalization error. Our experiments on several datasets show that the out-of-bag estimate and 10-fold cross-validation have very inaccurate (much too optimistic) confidence levels. We can improve the out-of-bag estimate by incorporating a correction.
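The out-of-bag procedure described in the abstract can be sketched in a few lines: train each predictor on a bootstrap sample, record its vote on every training example omitted from that sample, and score the plurality of those held-out votes against the true labels. The sketch below uses a toy 1-D dataset and a 1-nearest-neighbour rule as the base predictor; both are illustrative assumptions, not the paper's experimental setup.

```python
import random
from collections import Counter

# Hypothetical toy dataset for illustration: 1-D points with binary labels.
data = [(x / 10.0, 0 if x < 12 else 1) for x in range(24)]

def train_1nn(sample):
    """A 'predictor' here is a 1-nearest-neighbour rule over its bootstrap sample."""
    def predict(x):
        nearest = min(sample, key=lambda ex: abs(ex[0] - x))
        return nearest[1]
    return predict

random.seed(0)
B = 25  # number of bootstrap replicates in the bag
oob_votes = {i: [] for i in range(len(data))}  # votes from predictors that omitted example i

for _ in range(B):
    # Bootstrap sample: draw n indices with replacement.
    idx = [random.randrange(len(data)) for _ in range(len(data))]
    predictor = train_1nn([data[i] for i in idx])
    in_bag = set(idx)
    for i in range(len(data)):
        if i not in in_bag:  # example i is out-of-bag for this predictor
            oob_votes[i].append(predictor(data[i][0]))

# Out-of-bag error estimate: plurality of OOB votes vs. the true label.
errors = total = 0
for i, votes in oob_votes.items():
    if votes:  # skip the rare example that appeared in every bootstrap sample
        total += 1
        majority = Counter(votes).most_common(1)[0][0]
        errors += int(majority != data[i][1])

oob_error = errors / total
print(f"out-of-bag error estimate: {oob_error:.3f}")
```

No extra predictors are trained beyond the bag itself, which is the source of the time savings over 10-fold cross-validation noted in the abstract; the paper's contribution concerns correcting the optimism of this raw estimate.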
Cite
Text
Bylander and Hanzlik. "Estimating Generalization Error Using Out-of-Bag Estimates." AAAI Conference on Artificial Intelligence, 1999.
Markdown
[Bylander and Hanzlik. "Estimating Generalization Error Using Out-of-Bag Estimates." AAAI Conference on Artificial Intelligence, 1999.](https://mlanthology.org/aaai/1999/bylander1999aaai-estimating/)
BibTeX
@inproceedings{bylander1999aaai-estimating,
title = {{Estimating Generalization Error Using Out-of-Bag Estimates}},
author = {Bylander, Tom and Hanzlik, Dennis},
booktitle = {AAAI Conference on Artificial Intelligence},
year = {1999},
pages = {321--327},
url = {https://mlanthology.org/aaai/1999/bylander1999aaai-estimating/}
}