Class-Driven Statistical Discretization of Continuous Attributes (Extended Abstract)

Abstract

Discretization is a pre-processing step of the learning task that offers cognitive as well as computational benefits. This paper describes StatDisc, a statistical algorithm that supports supervised learning by performing class-driven discretization. StatDisc provides a concise summarization of continuous attributes by investigating the data composition, i.e., by discovering intervals of numeric attribute values within which the examples have a homogeneous class distribution that contrasts strongly with the distribution in other intervals. Experimental results from a variety of domains confirm that discretizing real-valued attributes causes little loss of learning accuracy while offering a large reduction in learning time.
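
The abstract does not spell out StatDisc's procedure, so the following is only a minimal sketch of the general idea of class-driven discretization, not the authors' algorithm. It assumes a generic bottom-up strategy in the spirit of ChiMerge: start with one interval per distinct value and repeatedly merge the adjacent pair whose class distributions are most alike, stopping once every adjacent pair contrasts strongly. The function names, the Pearson chi-square measure, and the threshold are all illustrative assumptions.

```python
from collections import Counter


def chi2_adjacent(a, b, classes):
    """Pearson chi-square statistic between two class-count Counters."""
    total = sum(a.values()) + sum(b.values())
    stat = 0.0
    for row in (a, b):
        row_total = sum(row.values())
        for c in classes:
            expected = row_total * (a[c] + b[c]) / total
            if expected > 0:  # skip classes absent from both intervals
                stat += (row[c] - expected) ** 2 / expected
    return stat


def merge_discretize(values, labels, threshold=2.706):
    """Hypothetical bottom-up discretizer (an assumption, not StatDisc).

    threshold=2.706 is the 0.90 chi-square quantile for 1 degree of
    freedom, which is only appropriate for a two-class problem.
    """
    classes = sorted(set(labels))
    # One initial interval per distinct attribute value.
    counts = {}
    for v, y in zip(values, labels):
        counts.setdefault(v, Counter())[y] += 1
    intervals = [[v, v, counts[v]] for v in sorted(counts)]

    while len(intervals) > 1:
        stats = [
            chi2_adjacent(intervals[i][2], intervals[i + 1][2], classes)
            for i in range(len(intervals) - 1)
        ]
        i = min(range(len(stats)), key=stats.__getitem__)
        if stats[i] >= threshold:
            break  # every adjacent pair already contrasts strongly
        lo, _, left = intervals[i]
        _, hi, right = intervals[i + 1]
        intervals[i:i + 2] = [[lo, hi, left + right]]

    return [(lo, hi, dict(c)) for lo, hi, c in intervals]


if __name__ == "__main__":
    xs = [1, 2, 3, 4, 10, 11, 12, 13]
    ys = ["a", "a", "a", "a", "b", "b", "b", "b"]
    print(merge_discretize(xs, ys))
```

On this toy input the class changes only between 4 and 10, so the sketch collapses eight distinct values into two intervals, each with a pure class distribution.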

Cite

Text

Richeldi and Rossotto. "Class-Driven Statistical Discretization of Continuous Attributes (Extended Abstract)." European Conference on Machine Learning, 1995. doi:10.1007/3-540-59286-5_81

Markdown

[Richeldi and Rossotto. "Class-Driven Statistical Discretization of Continuous Attributes (Extended Abstract)." European Conference on Machine Learning, 1995.](https://mlanthology.org/ecmlpkdd/1995/richeldi1995ecml-classdriven/) doi:10.1007/3-540-59286-5_81

BibTeX

@inproceedings{richeldi1995ecml-classdriven,
  title     = {{Class-Driven Statistical Discretization of Continuous Attributes (Extended Abstract)}},
  author    = {Richeldi, Marco and Rossotto, Mauro},
  booktitle = {European Conference on Machine Learning},
  year      = {1995},
  pages     = {335--338},
  doi       = {10.1007/3-540-59286-5_81},
  url       = {https://mlanthology.org/ecmlpkdd/1995/richeldi1995ecml-classdriven/}
}