Evaluating Misclassifications in Imbalanced Data

Elazmeh, William; Japkowicz, Nathalie; Matwin, Stan

doi:10.1007/11871842_16

ECML-PKDD 2006, pp. 126-137

Abstract

Evaluating classifier performance with ROC curves is popular in the machine learning community. To date, the only method to assess confidence of ROC curves is to construct ROC bands. In the case of severe class imbalance with few instances of the minority class, ROC bands become unreliable. We propose a generic framework for classifier evaluation to identify a segment of an ROC curve in which misclassifications are balanced. Confidence is measured by Tango’s 95%-confidence interval for the difference in misclassification in both classes. We test our method with severe class imbalance in a two-class problem. Our evaluation favors classifiers with low numbers of misclassifications in both classes. Our results show that the proposed evaluation method is more confident than ROC bands.

Cite

Text

Elazmeh et al. "Evaluating Misclassifications in Imbalanced Data." European Conference on Machine Learning, 2006. doi:10.1007/11871842_16

Markdown

[Elazmeh et al. "Evaluating Misclassifications in Imbalanced Data." European Conference on Machine Learning, 2006.](https://mlanthology.org/ecmlpkdd/2006/elazmeh2006ecml-evaluating/) doi:10.1007/11871842_16

BibTeX

@inproceedings{elazmeh2006ecml-evaluating,
  title     = {{Evaluating Misclassifications in Imbalanced Data}},
  author    = {Elazmeh, William and Japkowicz, Nathalie and Matwin, Stan},
  booktitle = {European Conference on Machine Learning},
  year      = {2006},
  pages     = {126--137},
  doi       = {10.1007/11871842_16},
  url       = {https://mlanthology.org/ecmlpkdd/2006/elazmeh2006ecml-evaluating/}
}