Multiple Comparisons in Induction Algorithms

Jensen, David D.; Cohen, Paul R.

doi:10.1023/A:1007631014630

Machine Learning (MLJ), 2000, pp. 309-338

Abstract

A single mechanism is responsible for three pathologies of induction algorithms: attribute selection errors, overfitting, and oversearching. In each pathology, induction algorithms compare multiple items based on scores from an evaluation function and select the item with the maximum score. We call this a multiple comparison procedure (MCP). We analyze the statistical properties of MCPs and show how failure to adjust for these properties leads to the pathologies. We also discuss approaches that can control pathological behavior, including Bonferroni adjustment, randomization testing, and cross-validation.
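
The effect described in the abstract can be illustrated with a small simulation. The sketch below is not taken from the paper: it assumes every candidate item is equally useless, models each evaluation score as an independent standard normal draw, and uses hypothetical helper names (`familywise_error`, `bonferroni_alpha`, `mean_max_score`). Under those assumptions it shows that the maximum of several scores is biased upward, and how a Bonferroni-style adjustment of the per-item significance level compensates.

```python
import random


def familywise_error(n_items: int, alpha: float) -> float:
    """Probability that at least one of n_items independent, truly useless
    items passes a per-item significance test at level alpha."""
    return 1.0 - (1.0 - alpha) ** n_items


def bonferroni_alpha(n_items: int, alpha: float) -> float:
    """Per-item level that keeps the overall error rate at or below alpha."""
    return alpha / n_items


def mean_max_score(n_items: int, trials: int, rng: random.Random) -> float:
    """Average, over many trials, of the maximum of n_items null scores.

    Each score is a standard normal draw (an assumption made for
    illustration), so every individual item has expected score 0.
    """
    total = 0.0
    for _ in range(trials):
        total += max(rng.gauss(0.0, 1.0) for _ in range(n_items))
    return total / trials


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so repeated runs agree
    alpha = 0.05
    for n in (1, 10, 100):
        unadjusted = familywise_error(n, alpha)
        adjusted = familywise_error(n, bonferroni_alpha(n, alpha))
        print(
            f"items={n:3d}  mean max score={mean_max_score(n, 2000, rng):.2f}  "
            f"P(false selection)={unadjusted:.3f}  with Bonferroni={adjusted:.3f}"
        )
```

With 10 useless items tested at a per-item level of 0.05, the chance that at least one looks significant is about 0.40 rather than 0.05. Dividing the level by the number of items brings it back under 0.05. The independence assumption matters here: scores of candidates evaluated on the same data are usually correlated, which is one reason the abstract also mentions randomization testing and cross-validation.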

Cite

Text

Jensen and Cohen. "Multiple Comparisons in Induction Algorithms." Machine Learning, 2000. doi:10.1023/A:1007631014630

Markdown

[Jensen and Cohen. "Multiple Comparisons in Induction Algorithms." Machine Learning, 2000.](https://mlanthology.org/mlj/2000/jensen2000mlj-multiple/) doi:10.1023/A:1007631014630

BibTeX

@article{jensen2000mlj-multiple,
  title     = {{Multiple Comparisons in Induction Algorithms}},
  author    = {Jensen, David D. and Cohen, Paul R.},
  journal   = {Machine Learning},
  year      = {2000},
  pages     = {309--338},
  doi       = {10.1023/A:1007631014630},
  volume    = {38},
  url       = {https://mlanthology.org/mlj/2000/jensen2000mlj-multiple/}
}