Subgroup Discovery with Proper Scoring Rules
Abstract
Subgroup Discovery is the process of finding and describing sufficiently large subsets of a given population that have unusual distributional characteristics with regard to some target attribute. Such subgroups can be used as a statistical summary which improves on the default summary of stating the overall distribution in the population. A natural way to evaluate such summaries is to quantify the difference between predicted and empirical distribution of the target. In this paper we propose to use proper scoring rules, a well-known family of evaluation measures for assessing the goodness of probability estimators, to obtain theoretically well-founded evaluation measures for subgroup discovery. From this perspective, one subgroup is better than another if it has lower divergence of target probability estimates from the actual labels on average. We demonstrate empirically on both synthetic and real-world data that this leads to higher quality statistical summaries than the existing methods based on measures such as Weighted Relative Accuracy.
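To make the comparison in the abstract concrete, the following is a minimal sketch, not the paper's exact measure. It assumes a binary target and uses the Brier score as the proper scoring rule. It also assumes that a subgroup-based summary predicts the subgroup's own positive rate for members and falls back to the overall positive rate for everyone else. The function names `brier_score`, `summary_brier`, `brier_gain`, and `wracc` are hypothetical and chosen for illustration only.

```python
import numpy as np


def brier_score(y, p):
    """Mean squared difference between binary labels and predicted probabilities."""
    y = np.asarray(y, dtype=float)
    p = np.asarray(p, dtype=float)
    return float(np.mean((y - p) ** 2))


def summary_brier(y, in_subgroup):
    """Brier score of a summary built from one subgroup.

    Assumption (illustrative): members are assigned the subgroup's positive
    rate, non-members the overall positive rate of the population.
    """
    y = np.asarray(y, dtype=float)
    mask = np.asarray(in_subgroup, dtype=bool)
    p = np.full_like(y, y.mean())   # default summary: overall distribution
    if mask.any():
        p[mask] = y[mask].mean()    # refined estimate inside the subgroup
    return brier_score(y, p)


def brier_gain(y, in_subgroup):
    """Improvement over the default summary; larger means a better subgroup."""
    y = np.asarray(y, dtype=float)
    default = brier_score(y, np.full_like(y, y.mean()))
    return default - summary_brier(y, in_subgroup)


def wracc(y, in_subgroup):
    """Weighted Relative Accuracy: coverage times the shift in positive rate."""
    y = np.asarray(y, dtype=float)
    mask = np.asarray(in_subgroup, dtype=bool)
    if not mask.any():
        return 0.0
    return float(mask.mean() * (y[mask].mean() - y.mean()))


# Toy population: 8 instances, the candidate subgroup covers the first 4.
labels = [1, 1, 1, 0, 0, 0, 0, 0]
members = [True, True, True, True, False, False, False, False]

print("Brier gain:", brier_gain(labels, members))
print("WRAcc:     ", wracc(labels, members))
```

Under these assumptions, candidate subgroups can be ranked by `brier_gain`, which rewards a subgroup only to the extent that its probability estimates bring the summary closer to the observed labels, whereas `wracc` rewards coverage multiplied by the shift in positive rate.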
Cite
Text
Song et al. "Subgroup Discovery with Proper Scoring Rules." European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, 2016. doi:10.1007/978-3-319-46227-1_31
Markdown
[Song et al. "Subgroup Discovery with Proper Scoring Rules." European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, 2016.](https://mlanthology.org/ecmlpkdd/2016/song2016ecmlpkdd-subgroup/) doi:10.1007/978-3-319-46227-1_31
BibTeX
@inproceedings{song2016ecmlpkdd-subgroup,
title = {{Subgroup Discovery with Proper Scoring Rules}},
author = {Song, Hao and Kull, Meelis and Flach, Peter A. and Kalogridis, Georgios},
booktitle = {European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases},
year = {2016},
pages = {492-510},
doi = {10.1007/978-3-319-46227-1_31},
url = {https://mlanthology.org/ecmlpkdd/2016/song2016ecmlpkdd-subgroup/}
}