Infinitely Imbalanced Logistic Regression

Abstract

In binary classification problems it is common for the two classes to be imbalanced: one class is very rare compared to the other. In this paper we consider the infinitely imbalanced case where one class has a finite sample size and the other class's sample size grows without bound. For logistic regression, the infinitely imbalanced case often has a useful solution. Under mild conditions, the intercept diverges as expected, but the rest of the coefficient vector approaches a non-trivial and useful limit. That limit can be expressed in terms of exponential tilting and is the minimum of a convex objective function. The limiting form of logistic regression suggests a computational shortcut for fraud detection problems.
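The exponential-tilting limit lends itself to a short numerical illustration. The sketch below is not from the paper; it assumes that the limiting coefficient vector minimizes a log-mean-exponential objective built from the abundant-class sample and the rare-class mean, which is one convex-objective form consistent with the abstract's description. The names (`X0`, `xbar`, `tilt_objective`) and the synthetic data are purely illustrative.

```python
import numpy as np
from scipy.optimize import minimize

# Hypothetical sketch (not the paper's code). Assumption: the limiting slope
# vector beta minimizes the convex objective
#   g(beta) = log( mean_j exp((X0_j - xbar)' beta) ),
# where X0 stands in for the abundant-class distribution F0 and xbar is the
# mean of the rare-class sample. The first-order condition of g is that the
# exponentially tilted mean of X0 equals xbar.

rng = np.random.default_rng(0)
X0 = rng.normal(size=(100_000, 2))   # abundant ("non-fraud") class sample
xbar = np.array([1.0, 0.5])          # mean of the rare ("fraud") class sample

def tilt_objective(beta):
    z = (X0 - xbar) @ beta
    m = z.max()                      # log-sum-exp stabilization
    return m + np.log(np.exp(z - m).mean())

beta_limit = minimize(tilt_objective, np.zeros(2), method="BFGS").x

# Check: the exponentially tilted mean of X0 should land near xbar.
# (For standard-normal X0, tilting by exp(x'beta) shifts the mean to beta,
# so beta_limit itself should be close to xbar in this toy example.)
w = np.exp(X0 @ beta_limit - (X0 @ beta_limit).max())
print(beta_limit, (w[:, None] * X0).sum(axis=0) / w.sum())
```

Read this way, the computational shortcut the abstract mentions for fraud detection is that the rare class enters the limit only through its sample mean, so the bulk of the computation involves the abundant class alone.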

Cite

Text

Owen. "Infinitely Imbalanced Logistic Regression." Journal of Machine Learning Research, 2007.

Markdown

[Owen. "Infinitely Imbalanced Logistic Regression." Journal of Machine Learning Research, 2007.](https://mlanthology.org/jmlr/2007/owen2007jmlr-infinitely/)

BibTeX

@article{owen2007jmlr-infinitely,
  title     = {{Infinitely Imbalanced Logistic Regression}},
  author    = {Owen, Art B.},
  journal   = {Journal of Machine Learning Research},
  year      = {2007},
  pages     = {761--773},
  volume    = {8},
  url       = {https://mlanthology.org/jmlr/2007/owen2007jmlr-infinitely/}
}