An Empirical Comparison of Selection Measures for Decision-Tree Induction
Abstract
One approach to induction is to develop a decision tree from a set of examples. When used with noisy rather than deterministic data, the method involves three main stages: creating a complete tree able to classify all the examples, pruning this tree to give statistical reliability, and processing the pruned tree to improve understandability. This paper is concerned with the first stage, tree creation, which relies on a measure of "goodness of split," that is, how well the attributes discriminate between classes. Some problems encountered at this stage are missing data and multi-valued attributes. The paper considers a number of different measures and experimentally examines their behavior in four domains. The results show that the choice of measure affects the size of a tree but not its accuracy, which remains the same even when attributes are selected randomly.
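Information gain, the entropy-based measure used in Quinlan's ID3, is one of the selection measures of the kind the paper compares. A minimal sketch of how such a "goodness of split" score can be computed for a candidate attribute (the attribute and label names here are illustrative, not from the paper):

```python
from collections import Counter
from math import log2

def entropy(labels):
    """Shannon entropy of a list of class labels, in bits."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())

def information_gain(examples, attr, label_key="class"):
    """Reduction in class entropy from splitting `examples` on `attr`.

    `examples` is a list of dicts; each dict maps attribute names
    (and the class label under `label_key`) to values.
    """
    labels = [e[label_key] for e in examples]
    before = entropy(labels)
    # Weighted average entropy of the subsets produced by the split.
    after = 0.0
    for value in {e[attr] for e in examples}:
        subset = [e[label_key] for e in examples if e[attr] == value]
        after += len(subset) / len(examples) * entropy(subset)
    return before - after

# Toy data: splitting on "outlook" separates the classes perfectly,
# so the gain equals the full initial entropy (1 bit here).
data = [
    {"outlook": "sunny", "class": "no"},
    {"outlook": "sunny", "class": "no"},
    {"outlook": "rain", "class": "yes"},
    {"outlook": "rain", "class": "yes"},
]
print(information_gain(data, "outlook"))  # → 1.0
```

At each node, tree induction would evaluate every remaining attribute with such a measure and split on the highest-scoring one; the paper's finding is that swapping this measure changes tree size but not predictive accuracy.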
Cite
Text
Mingers. "An Empirical Comparison of Selection Measures for Decision-Tree Induction." Machine Learning, 1989. doi:10.1007/BF00116837

Markdown

[Mingers. "An Empirical Comparison of Selection Measures for Decision-Tree Induction." Machine Learning, 1989.](https://mlanthology.org/mlj/1989/mingers1989mlj-empirical/) doi:10.1007/BF00116837

BibTeX
@article{mingers1989mlj-empirical,
title = {{An Empirical Comparison of Selection Measures for Decision-Tree Induction}},
author = {Mingers, John},
journal = {Machine Learning},
year = {1989},
pages = {319--342},
doi = {10.1007/BF00116837},
volume = {3},
url = {https://mlanthology.org/mlj/1989/mingers1989mlj-empirical/}
}