Compression-Based Pruning of Decision Lists

Abstract

We define a formula for estimating the coding cost of decision lists in propositional domains. The formula handles multiple classes and both categorical and numerical attributes. On artificial domains it performs quite satisfactorily, whereas results on natural domains are mixed and inconclusive. Further experiments lead to a principled simplification of the original formula that is robust in both artificial and natural domains. Simple hill-climbing search for the most compressive decision list significantly reduces the complexity of a given decision list without impairing, and sometimes even improving, its predictive accuracy.
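The general idea can be illustrated with a minimal sketch: score a decision list by an MDL-style total cost (bits for the rules plus bits for the exceptions they fail to cover) and hill-climb by deleting whichever rule most reduces that cost. The cost formulas below are illustrative stand-ins, not the paper's actual coding-cost formula, and all function names are our own.

```python
import math

def classify(rules, default, x):
    # a decision list: first matching rule wins, otherwise the default class
    for conds, label in rules:
        if all(x.get(attr) == val for attr, val in conds):
            return label
    return default

def list_cost(rules):
    # toy model cost: a fixed 2 bits per rule condition (not the paper's formula)
    return sum(2.0 * len(conds) for conds, _ in rules)

def exceptions_cost(rules, default, data):
    # toy data cost: log2(N choose E) bits to flag the E misclassified examples
    errors = sum(1 for x, y in data if classify(rules, default, x) != y)
    return math.log2(math.comb(len(data), errors)) if errors else 0.0

def total_cost(rules, default, data):
    return list_cost(rules) + exceptions_cost(rules, default, data)

def prune(rules, default, data):
    # hill-climbing: repeatedly drop the rule whose removal lowers total cost
    best = total_cost(rules, default, data)
    improved = True
    while improved and rules:
        improved = False
        for i in range(len(rules)):
            candidate = rules[:i] + rules[i + 1:]
            cost = total_cost(candidate, default, data)
            if cost < best:
                best, rules, improved = cost, candidate, True
                break
    return rules
```

On a toy dataset where one rule never fires, `prune` deletes that rule because it costs model bits while capturing no examples, mirroring how a compression criterion trades rule complexity against exceptions.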

Cite

Text

Pfahringer. "Compression-Based Pruning of Decision Lists." European Conference on Machine Learning, 1997. doi:10.1007/3-540-62858-4_85

Markdown

[Pfahringer. "Compression-Based Pruning of Decision Lists." European Conference on Machine Learning, 1997.](https://mlanthology.org/ecmlpkdd/1997/pfahringer1997ecml-compressionbased/) doi:10.1007/3-540-62858-4_85

BibTeX

@inproceedings{pfahringer1997ecml-compressionbased,
  title     = {{Compression-Based Pruning of Decision Lists}},
  author    = {Pfahringer, Bernhard},
  booktitle = {European Conference on Machine Learning},
  year      = {1997},
  pages     = {199--212},
  doi       = {10.1007/3-540-62858-4_85},
  url       = {https://mlanthology.org/ecmlpkdd/1997/pfahringer1997ecml-compressionbased/}
}