From Cost-Sensitive Classification to Tight F-Measure Bounds

Abstract

The F-measure is a classification performance measure, especially suited to imbalanced datasets, that provides a compromise between the precision and the recall of a classifier. As this measure is non-convex and non-linear, it is often optimized indirectly via cost-sensitive learning (which assigns different costs to false positives and false negatives). In this article, we derive theoretical guarantees that give tight bounds on the best F-measure obtainable from cost-sensitive learning. We also give an original geometric interpretation of these bounds, which inspires CONE, a new algorithm for optimizing the F-measure. Using 10 datasets with varied class imbalance, we show that our bounds are much tighter than those of previous work, and that CONE learns models whose F-measures are either higher than those of existing methods, or comparable but reached in fewer iterations.
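To make the indirect optimization concrete, below is a minimal Python sketch of the cost-sensitive approach the abstract describes: sweeping a cost c on the positive class and keeping the classifier with the best F-measure (F1 = 2PR/(P+R)). This naive grid over costs is an illustrative baseline, not the paper's CONE algorithm (which uses the derived bounds to prune the cost space); the synthetic dataset and the class_weight parameterization are assumptions for the example.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score
from sklearn.model_selection import train_test_split

# Toy imbalanced dataset (roughly 10% positives).
X, y = make_classification(n_samples=2000, weights=[0.9, 0.1], random_state=0)
X_tr, X_val, y_tr, y_val = train_test_split(X, y, test_size=0.3, random_state=0)

best_f1, best_cost = 0.0, None
# Cost-sensitive learning: class_weight={0: 1-c, 1: c} penalizes errors on
# positives (false negatives) with weight c and errors on negatives (false
# positives) with weight 1-c. Each c yields one classifier; we keep the one
# with the best validation F1.
for c in np.linspace(0.05, 0.95, 19):
    clf = LogisticRegression(class_weight={0: 1 - c, 1: c}, max_iter=1000)
    clf.fit(X_tr, y_tr)
    f1 = f1_score(y_val, clf.predict(X_val))
    if f1 > best_f1:
        best_f1, best_cost = f1, c

print(f"best F1 = {best_f1:.3f} at cost c = {best_cost:.2f}")
```

The paper's contribution is to bound, for each cost c, the best F-measure still reachable elsewhere in the cost space, so that a sweep like this one can discard regions of c without training a model there.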

Cite

Text

Bascol et al. "From Cost-Sensitive Classification to Tight F-Measure Bounds." Artificial Intelligence and Statistics, 2019.

Markdown

[Bascol et al. "From Cost-Sensitive Classification to Tight F-Measure Bounds." Artificial Intelligence and Statistics, 2019.](https://mlanthology.org/aistats/2019/bascol2019aistats-costsensitive/)

BibTeX

@inproceedings{bascol2019aistats-costsensitive,
  title     = {{From Cost-Sensitive Classification to Tight F-Measure Bounds}},
  author    = {Bascol, Kevin and Emonet, Rémi and Fromont, Elisa and Habrard, Amaury and Metzler, Guillaume and Sebban, Marc},
  booktitle = {Artificial Intelligence and Statistics},
  year      = {2019},
  pages     = {1245--1253},
  volume    = {89},
  url       = {https://mlanthology.org/aistats/2019/bascol2019aistats-costsensitive/}
}