Fast Effective Rule Induction

Abstract

Many existing rule learning systems are computationally expensive on large noisy datasets. In this paper, we evaluate the recently proposed rule learning algorithm IREP on a large and diverse collection of benchmark problems. We show that while IREP is extremely efficient, it frequently gives error rates higher than those of C4.5 and C4.5rules. We then propose a number of modifications resulting in an algorithm, RIPPERk, that is very competitive with C4.5rules with respect to error rates, but much more efficient on large samples. RIPPERk obtains error rates lower than or equivalent to those of C4.5rules on 22 of 37 benchmark problems, scales nearly linearly with the number of training examples, and can efficiently process noisy datasets containing hundreds of thousands of examples.

Cite

Text

Cohen. "Fast Effective Rule Induction." International Conference on Machine Learning, 1995. doi:10.1016/B978-1-55860-377-6.50023-2

Markdown

[Cohen. "Fast Effective Rule Induction." International Conference on Machine Learning, 1995.](https://mlanthology.org/icml/1995/cohen1995icml-fast/) doi:10.1016/B978-1-55860-377-6.50023-2

BibTeX

@inproceedings{cohen1995icml-fast,
  title     = {{Fast Effective Rule Induction}},
  author    = {Cohen, William W.},
  booktitle = {International Conference on Machine Learning},
  year      = {1995},
  pages     = {115--123},
  doi       = {10.1016/B978-1-55860-377-6.50023-2},
  url       = {https://mlanthology.org/icml/1995/cohen1995icml-fast/}
}