Generalization Bounds for Decision Trees

Abstract

We derive a new bound on the error rate for decision trees. The bound depends both on the structure of the tree and on the specific sample (not just the size of the sample). This bound is tighter than traditional bounds for unbalanced trees and justifies "compositional" algorithms for constructing decision trees.

1 Introduction

The problem of over-fitting is central to both the theory and practice of machine learning. Intuitively, one over-fits by using too many parameters in the concept, e.g., fitting an nth-order polynomial to n data points. One under-fits by using too few parameters, e.g., fitting a linear curve to clearly quadratic data. The fundamental question is how many parameters, or what concept size, should one allow for a given amount of training data. A standard theoretical approach is to prove a bound on generalization error as a function of the training error and the concept size (or VC dimension), and then select a concept minimizing this bound, i.e., optimizing a certain...
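The over-fitting intuition above (an (n-1)th-degree polynomial can interpolate n data points exactly while generalizing poorly) can be illustrated with a small sketch. The data, noise level, and degrees here are invented for illustration only:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: 8 noisy samples from a quadratic, plus a noise-free test grid.
x_train = np.linspace(-1, 1, 8)
y_train = x_train**2 + rng.normal(scale=0.1, size=x_train.size)
x_test = np.linspace(-0.95, 0.95, 50)
y_test = x_test**2

def fit_errors(degree):
    # Least-squares polynomial fit of the given degree to the training set.
    coeffs = np.polyfit(x_train, y_train, degree)
    train_err = np.mean((np.polyval(coeffs, x_train) - y_train) ** 2)
    test_err = np.mean((np.polyval(coeffs, x_test) - y_test) ** 2)
    return train_err, test_err

# A degree-7 polynomial interpolates all 8 points (near-zero training error),
# while the degree-2 fit leaves residual training error but tracks the
# underlying quadratic more faithfully on held-out points.
train2, test2 = fit_errors(2)
train7, test7 = fit_errors(7)
```

The degree-7 fit drives training error to (numerical) zero by absorbing the noise, which is exactly the "too many parameters" regime the bound-minimization approach is meant to penalize.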

Cite

Text

Mansour and McAllester. "Generalization Bounds for Decision Trees." Annual Conference on Computational Learning Theory, 2000.

Markdown

[Mansour and McAllester. "Generalization Bounds for Decision Trees." Annual Conference on Computational Learning Theory, 2000.](https://mlanthology.org/colt/2000/mansour2000colt-generalization/)

BibTeX

@inproceedings{mansour2000colt-generalization,
  title     = {{Generalization Bounds for Decision Trees}},
  author    = {Mansour, Yishay and McAllester, David A.},
  booktitle = {Annual Conference on Computational Learning Theory},
  year      = {2000},
  pages     = {69--74},
  url       = {https://mlanthology.org/colt/2000/mansour2000colt-generalization/}
}