Nonparametric Regularization of Decision Trees
Abstract
We discuss the problem of choosing the complexity of a decision tree (measured by the number of leaf nodes) that yields the highest generalization performance. We first present an analysis of the generalization error of decision trees that gives a new perspective on the regularization parameter inherent to any regularization (e.g., pruning) algorithm. There is an optimal setting of this parameter for every learning problem; a setting that does well for one problem will inevitably do poorly for others. We show that the optimal setting can in fact be estimated from the sample, without “trying out” various settings on holdout data. This leads to a nonparametric decision tree regularization algorithm that can, in principle, work well for all learning problems.
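For context, the conventional practice the abstract contrasts with, "trying out" candidate settings of the pruning parameter on held-out data, can be sketched as follows. This is not the paper's method; it uses scikit-learn's cost-complexity pruning parameter `ccp_alpha` with cross-validation, and the dataset and parameter grid are illustrative assumptions.

```python
# Conventional, holdout-based selection of a tree-pruning parameter:
# each candidate ccp_alpha trades tree complexity (number of leaves)
# against training error, and cross-validation picks one empirically.
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import GridSearchCV
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)

# The grid of candidate settings is arbitrary; a setting that works here
# need not transfer to a different learning problem.
search = GridSearchCV(
    DecisionTreeClassifier(random_state=0),
    param_grid={"ccp_alpha": [0.0, 0.001, 0.005, 0.01, 0.05]},
    cv=5,
)
search.fit(X, y)
print(search.best_params_["ccp_alpha"])
```

The paper's contribution is to avoid this search loop entirely by estimating the optimal regularization strength directly from the training sample.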
Cite
Text
Scheffer. "Nonparametric Regularization of Decision Trees." European Conference on Machine Learning, 2000. doi:10.1007/3-540-45164-1_36
Markdown
[Scheffer. "Nonparametric Regularization of Decision Trees." European Conference on Machine Learning, 2000.](https://mlanthology.org/ecmlpkdd/2000/scheffer2000ecml-nonparametric/) doi:10.1007/3-540-45164-1_36
BibTeX
@inproceedings{scheffer2000ecml-nonparametric,
title = {{Nonparametric Regularization of Decision Trees}},
author = {Scheffer, Tobias},
booktitle = {European Conference on Machine Learning},
year = {2000},
pages = {344--356},
doi = {10.1007/3-540-45164-1_36},
url = {https://mlanthology.org/ecmlpkdd/2000/scheffer2000ecml-nonparametric/}
}