Bagging Using Statistical Queries

Van Assche, Anneleen; Blockeel, Hendrik

doi:10.1007/11871842_83

ECML-PKDD 2006 pp. 809-816

Abstract

Bagging is an ensemble method that relies on random resampling of a data set to construct models for the ensemble. When only statistics about the data are available, but no individual examples, the straightforward resampling procedure cannot be implemented. The question is then whether bagging can somehow be simulated. In this paper we propose a method that, instead of computing certain heuristics (such as information gain) from a resampled version of the data, estimates the probability distribution of these heuristics under random resampling, and then samples from this distribution. The resulting method is not entirely equivalent to bagging because it ignores certain dependencies among statistics. Nevertheless, experiments show that this “simulated bagging” yields accuracy similar to that of bagging, while being as efficient and more generally applicable.
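
The idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' estimator: it assumes the only available statistics are the cell counts of a branch-by-class contingency table for one candidate test, and it uses the fact that a bootstrap sample of the same size yields cell counts that follow a multinomial distribution with probabilities proportional to the observed counts. The function names (`information_gain`, `resample_table`, `simulated_gain`) are hypothetical and chosen only for this example.

```python
import math
import random
from collections import Counter


def entropy(counts):
    """Shannon entropy (in bits) of a list of class counts."""
    total = sum(counts)
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log2(c / total) for c in counts if c > 0)


def information_gain(table):
    """Information gain of a test; table[branch][cls] is a count."""
    n = sum(sum(row) for row in table)
    if n == 0:
        return 0.0
    class_totals = [sum(col) for col in zip(*table)]
    remainder = sum((sum(row) / n) * entropy(row) for row in table)
    return entropy(class_totals) - remainder


def resample_table(table, rng):
    """Draw cell counts as a same-size bootstrap sample would produce them.

    Assumption for this sketch: only the counts are known, so the draw is
    multinomial with probabilities proportional to the observed counts.
    """
    cells = [(b, c) for b, row in enumerate(table) for c in range(len(row))]
    weights = [table[b][c] for b, c in cells]
    n = sum(weights)
    if n == 0:
        return [list(row) for row in table]
    drawn = Counter(rng.choices(cells, weights=weights, k=n))
    return [[drawn[(b, c)] for c in range(len(row))]
            for b, row in enumerate(table)]


def simulated_gain(table, rng):
    """One sample of the heuristic under simulated resampling."""
    return information_gain(resample_table(table, rng))


if __name__ == "__main__":
    rng = random.Random(0)
    observed = [[30, 10], [5, 25]]  # made-up counts: 2 branches x 2 classes
    print("gain on observed counts:", information_gain(observed))
    print("gains under simulated resampling:",
          [round(simulated_gain(observed, rng), 3) for _ in range(5)])
```

Calling `simulated_gain` separately for each candidate test draws each table independently, whereas a real bootstrap sample would affect all tests at once. That is one way to read the abstract's remark that the method ignores certain dependencies among statistics.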

Cite

Text

Van Assche and Blockeel. "Bagging Using Statistical Queries." European Conference on Machine Learning, 2006. doi:10.1007/11871842_83

Markdown

[Van Assche and Blockeel. "Bagging Using Statistical Queries." European Conference on Machine Learning, 2006.](https://mlanthology.org/ecmlpkdd/2006/assche2006ecml-bagging/) doi:10.1007/11871842_83

BibTeX

@inproceedings{assche2006ecml-bagging,
  title     = {{Bagging Using Statistical Queries}},
  author    = {Van Assche, Anneleen and Blockeel, Hendrik},
  booktitle = {European Conference on Machine Learning},
  year      = {2006},
  pages     = {809-816},
  doi       = {10.1007/11871842_83},
  url       = {https://mlanthology.org/ecmlpkdd/2006/assche2006ecml-bagging/}
}