Noise Elimination in Inductive Concept Learning: A Case Study in Medical Diagnosis

Abstract

Compression measures used in inductive learners, such as measures based on the MDL (Minimum Description Length) principle, provide a theoretically justified basis for grading candidate hypotheses. Compression-based induction is also appropriate for handling noisy data. This paper shows that a simple compression measure can be used to detect noisy examples. A technique is proposed in which noisy examples are detected and eliminated from the training set, and a hypothesis is then built from the remaining examples. Separating noise detection from hypothesis formation has the advantage that noisy examples do not influence hypothesis construction, in contrast to most standard approaches to noise handling, in which the learner tries to avoid overfitting the noisy example set. The noise elimination method is applied to the early diagnosis of rheumatic diseases, a problem known to be difficult both because of its nature and because of imperfections in the dataset. The method is evaluated by applying the noise elimination algorithm in conjunction with the CN2 rule induction algorithm and comparing their performance to earlier results obtained by CN2 in this diagnostic domain.

Cite

Text

Gamberger et al. "Noise Elimination in Inductive Concept Learning: A Case Study in Medical Diagnosis." International Conference on Algorithmic Learning Theory, 1996. doi:10.1007/3-540-61863-5_47

Markdown

[Gamberger et al. "Noise Elimination in Inductive Concept Learning: A Case Study in Medical Diagnosis." International Conference on Algorithmic Learning Theory, 1996.](https://mlanthology.org/alt/1996/gamberger1996alt-noise/) doi:10.1007/3-540-61863-5_47

BibTeX

@inproceedings{gamberger1996alt-noise,
  title     = {{Noise Elimination in Inductive Concept Learning: A Case Study in Medical Diagnosis}},
  author    = {Gamberger, Dragan and Lavrac, Nada and Dzeroski, Saso},
  booktitle = {International Conference on Algorithmic Learning Theory},
  year      = {1996},
  pages     = {199--212},
  doi       = {10.1007/3-540-61863-5_47},
  url       = {https://mlanthology.org/alt/1996/gamberger1996alt-noise/}
}