Decision Tree Pruning: Biased or Optimal?

Abstract

We evaluate the performance of weakest-link pruning of decision trees using cross-validation. This technique maps tree pruning into a problem of tree selection: find the best (i.e., the right-sized) tree from a set of trees ranging in size from the unpruned tree to a null tree. For samples with at least 200 cases, extensive empirical evidence supports the following conclusions relative to tree selection: (a) 10-fold cross-validation is nearly unbiased; (b) not pruning a covering tree is highly biased; (c) 10-fold cross-validation is consistent with optimal tree selection for large sample sizes; and (d) the accuracy of tree selection by 10-fold cross-validation is largely dependent on sample size, irrespective of the population distribution.
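The selection procedure the abstract describes can be sketched on a toy problem. The following is a minimal illustration, not the authors' implementation: candidate "trees" of increasing size are depth-limited histogram classifiers on 1-D data (depth 0 playing the role of the null tree), and 10-fold cross-validation picks the size with the lowest held-out error. All names (`make_data`, `fit_tree`, `cv_error`) and the synthetic data are assumptions made for illustration.

```python
import random

def make_data(n=200, seed=0):
    # Synthetic 1-D data: label 1 when x > 0.6, with 10% label noise.
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        x = rng.random()
        y = 1 if x > 0.6 else 0
        if rng.random() < 0.1:
            y = 1 - y
        data.append((x, y))
    return data

def fit_tree(train, depth):
    # A stand-in "tree" of the given size: split [0, 1) into 2**depth
    # equal bins and predict the majority label in each bin.
    # depth 0 is the null tree (a single majority-class leaf).
    bins = 2 ** depth
    counts = [[0, 0] for _ in range(bins)]
    for x, y in train:
        counts[min(int(x * bins), bins - 1)][y] += 1
    return [0 if c[0] >= c[1] else 1 for c in counts]

def predict(tree, x):
    bins = len(tree)
    return tree[min(int(x * bins), bins - 1)]

def cv_error(data, depth, folds=10):
    # 10-fold cross-validated error rate for one candidate tree size.
    n = len(data)
    errors = 0
    for f in range(folds):
        test = [data[i] for i in range(n) if i % folds == f]
        train = [data[i] for i in range(n) if i % folds != f]
        tree = fit_tree(train, depth)
        errors += sum(predict(tree, x) != y for x, y in test)
    return errors / n

data = make_data()
scores = {d: cv_error(data, d) for d in range(6)}
best_depth = min(scores, key=scores.get)
```

In this sketch, tree selection reduces to picking the depth with minimum cross-validated error, mirroring the paper's framing of pruning as choosing the right-sized tree from a nested family.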

Cite

Text

Weiss and Indurkhya. "Decision Tree Pruning: Biased or Optimal?" AAAI Conference on Artificial Intelligence, 1994.

Markdown

[Weiss and Indurkhya. "Decision Tree Pruning: Biased or Optimal?" AAAI Conference on Artificial Intelligence, 1994.](https://mlanthology.org/aaai/1994/weiss1994aaai-decision/)

BibTeX

@inproceedings{weiss1994aaai-decision,
  title     = {{Decision Tree Pruning: Biased or Optimal?}},
  author    = {Weiss, Sholom M. and Indurkhya, Nitin},
  booktitle = {AAAI Conference on Artificial Intelligence},
  year      = {1994},
  pages     = {626--632},
  url       = {https://mlanthology.org/aaai/1994/weiss1994aaai-decision/}
}