Bagging Is a Small-Data-Set Phenomenon
Abstract
Bagging forms a committee of classifiers by bootstrap aggregation of training sets from a pool of training data. A simple alternative to bagging is to partition the data into disjoint subsets. Experiments on various datasets show that, given the same size partitions and bags, disjoint partitions result in better performance than bootstrap aggregates (bags). Many applications (e.g., protein structure prediction) involve the use of datasets that are too large to handle in the memory of a typical computer. Our results indicate that, in such applications, the simple approach of creating a committee of classifiers from disjoint partitions is preferred over the more complex approach of bagging.
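The two ways of building training subsets that the abstract compares can be sketched as follows. This is a minimal illustration only, not the authors' implementation: the function names `make_bags`, `make_disjoint_partitions`, and `majority_vote` are hypothetical, the subsets are represented as lists of row indices, and combining committee members by simple majority vote is an assumption made here for illustration.

```python
import random
from collections import Counter


def make_bags(n_items, n_bags, bag_size, seed=0):
    """Bootstrap bags: each bag samples indices WITH replacement,
    so an index may repeat within a bag and across bags."""
    rng = random.Random(seed)
    return [
        [rng.randrange(n_items) for _ in range(bag_size)]
        for _ in range(n_bags)
    ]


def make_disjoint_partitions(n_items, n_parts, seed=0):
    """Disjoint partitions: shuffle the indices once, then deal them
    out so that every index lands in exactly one partition."""
    rng = random.Random(seed)
    indices = list(range(n_items))
    rng.shuffle(indices)
    return [indices[i::n_parts] for i in range(n_parts)]


def majority_vote(predictions):
    """Combine the committee's predicted labels for one example
    (assumed voting scheme, chosen here for illustration)."""
    return Counter(predictions).most_common(1)[0][0]


if __name__ == "__main__":
    n_items, n_subsets = 100, 4
    size = n_items // n_subsets  # same size for bags and partitions

    bags = make_bags(n_items, n_subsets, size)
    parts = make_disjoint_partitions(n_items, n_subsets)

    # Bags typically cover only part of the pool; partitions cover all of it.
    print("distinct indices in bags:", len({i for b in bags for i in b}))
    print("distinct indices in partitions:", len({i for p in parts for i in p}))
```

With equal subset sizes, the disjoint partitions together use every training example exactly once, whereas the bags typically leave some examples unused and repeat others. One classifier would be trained on each subset and the committee's outputs combined per test example.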
Cite
Text
Chawla et al. "Bagging Is a Small-Data-Set Phenomenon." IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2001. doi:10.1109/CVPR.2001.991030
Markdown
[Chawla et al. "Bagging Is a Small-Data-Set Phenomenon." IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2001.](https://mlanthology.org/cvpr/2001/chawla2001cvpr-bagging/) doi:10.1109/CVPR.2001.991030
BibTeX
@inproceedings{chawla2001cvpr-bagging,
title = {{Bagging Is a Small-Data-Set Phenomenon}},
author = {Chawla, Nitesh V. and Moore, Thomas E. and Bowyer, Kevin W. and Hall, Lawrence O. and Springer, Clayton and Kegelmeyer, W. Philip},
booktitle = {IEEE/CVF Conference on Computer Vision and Pattern Recognition},
year = {2001},
pages = {II:684-689},
doi = {10.1109/CVPR.2001.991030},
url = {https://mlanthology.org/cvpr/2001/chawla2001cvpr-bagging/}
}