A Fast, Bottom-up Decision Tree Pruning Algorithm with Near-Optimal Generalization
Abstract
In this work, we present a new bottom-up algorithm for decision tree pruning that is very efficient (requiring only a single pass through the given tree), and prove a strong performance guarantee for the generalization error of the resulting pruned tree. We work in the typical setting in which the given tree T may have been derived from the given training sample S, and thus may badly overfit S. In this setting, we give bounds on the amount of additional generalization error that our pruning suffers compared to the optimal pruning of T . More generally, our results show that if there is a pruning of T with small error, and whose size is small compared to jSj, then our algorithm will find a pruning whose error is not much larger. This style of result has been called an index of resolvability result by Barron and Cover in the context of density estimation. A novel feature of our algorithm is its locality --- the decision to prune a subtree is based entirely on properties of that subtree ...
Cite
Text
Kearns and Mansour. "A Fast, Bottom-up Decision Tree Pruning Algorithm with Near-Optimal Generalization." International Conference on Machine Learning, 1998.Markdown
[Kearns and Mansour. "A Fast, Bottom-up Decision Tree Pruning Algorithm with Near-Optimal Generalization." International Conference on Machine Learning, 1998.](https://mlanthology.org/icml/1998/kearns1998icml-fast/)BibTeX
@inproceedings{kearns1998icml-fast,
title = {{A Fast, Bottom-up Decision Tree Pruning Algorithm with Near-Optimal Generalization}},
author = {Kearns, Michael J. and Mansour, Yishay},
booktitle = {International Conference on Machine Learning},
year = {1998},
pages = {269-277},
url = {https://mlanthology.org/icml/1998/kearns1998icml-fast/}
}