Evaluation Measures for Multi-Class Subgroup Discovery

Abstract

Subgroup discovery aims to find subsets of a population whose class distribution differs significantly from the overall distribution. So far it has been investigated predominantly in a two-class context. This paper investigates multi-class subgroup discovery methods. We consider six evaluation measures for multi-class subgroups, four of them new, and study their theoretical properties. We extend the two-class subgroup discovery algorithm CN2-SD to incorporate the new evaluation measures and a new weighting scheme inspired by AdaBoost. We demonstrate the usefulness of multi-class subgroup discovery experimentally, using discovered subgroups as features for a decision tree learner. Not only is the number of leaves of the decision tree reduced by a factor of between 8 and 16 on average, but significant improvements in accuracy and AUC are also achieved with particular evaluation measures and settings. Similar performance improvements can be observed when using naive Bayes.
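
To make the notion of a subgroup whose class distribution differs from the overall distribution concrete, the following is a minimal illustrative sketch. It is not one of the six evaluation measures studied in the paper, which the abstract does not specify. The function names `class_distribution` and `subgroup_deviation`, and the choice of a coverage-weighted total variation distance, are assumptions made purely for illustration.

```python
from collections import Counter


def class_distribution(labels, classes):
    """Return the relative frequency of each class in `labels`."""
    counts = Counter(labels)
    n = len(labels)
    return {c: counts[c] / n for c in classes}


def subgroup_deviation(labels, in_subgroup):
    """Illustrative score (an assumption, not a measure from the paper):
    coverage of the subgroup times the total variation distance between
    the subgroup's class distribution and the overall class distribution."""
    classes = sorted(set(labels))
    overall = class_distribution(labels, classes)

    # Labels of the examples covered by the subgroup description.
    covered = [y for y, m in zip(labels, in_subgroup) if m]
    if not covered:
        return 0.0  # an empty subgroup carries no information

    local = class_distribution(covered, classes)
    coverage = len(covered) / len(labels)
    total_variation = 0.5 * sum(abs(local[c] - overall[c]) for c in classes)
    return coverage * total_variation


# Three balanced classes; the subgroup covers exactly the two "a" examples.
labels = ["a", "a", "b", "b", "c", "c"]
mask = [True, True, False, False, False, False]
score = subgroup_deviation(labels, mask)  # 2/9, roughly 0.222
```

A subgroup covering the whole population scores 0 under this sketch, since its class distribution equals the overall one. Weighting by coverage is one common way to trade off a subgroup's size against how unusual its distribution is.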

Cite

Text

Abudawood and Flach. "Evaluation Measures for Multi-Class Subgroup Discovery." European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, 2009. doi:10.1007/978-3-642-04180-8_20

Markdown

[Abudawood and Flach. "Evaluation Measures for Multi-Class Subgroup Discovery." European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, 2009.](https://mlanthology.org/ecmlpkdd/2009/abudawood2009ecmlpkdd-evaluation/) doi:10.1007/978-3-642-04180-8_20

BibTeX

@inproceedings{abudawood2009ecmlpkdd-evaluation,
  title     = {{Evaluation Measures for Multi-Class Subgroup Discovery}},
  author    = {Abudawood, Tarek and Flach, Peter A.},
  booktitle = {European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases},
  year      = {2009},
  pages     = {35-50},
  doi       = {10.1007/978-3-642-04180-8_20},
  url       = {https://mlanthology.org/ecmlpkdd/2009/abudawood2009ecmlpkdd-evaluation/}
}