Wasserstein Logistic Regression with Mixed Features

Abstract

Recent work has leveraged the popular distributionally robust optimization paradigm to combat overfitting in classical logistic regression. While the resulting classification scheme displays a promising performance in numerical experiments, it is inherently limited to numerical features. In this paper, we show that distributionally robust logistic regression with mixed (\emph{i.e.}, numerical and categorical) features, despite amounting to an optimization problem of exponential size, admits a polynomial-time solution scheme. We subsequently develop a practically efficient cutting plane approach that solves the problem as a sequence of polynomial-time solvable exponential conic programs. Our method retains many of the desirable theoretical features of previous works, but---in contrast to the literature---it does not admit an equivalent representation as a regularized logistic regression, that is, it represents a genuinely novel variant of the logistic regression problem. We show that our method outperforms both the unregularized and the regularized logistic regression on categorical as well as mixed-feature benchmark instances.

Cite

Text

Selvi et al. "Wasserstein Logistic Regression with Mixed Features." Neural Information Processing Systems, 2022.

Markdown

[Selvi et al. "Wasserstein Logistic Regression with Mixed Features." Neural Information Processing Systems, 2022.](https://mlanthology.org/neurips/2022/selvi2022neurips-wasserstein/)

BibTeX

@inproceedings{selvi2022neurips-wasserstein,
  title     = {{Wasserstein Logistic Regression with Mixed Features}},
  author    = {Selvi, Aras and Belbasi, Mohammad Reza and Haugh, Martin and Wiesemann, Wolfram},
  booktitle = {Neural Information Processing Systems},
  year      = {2022},
  url       = {https://mlanthology.org/neurips/2022/selvi2022neurips-wasserstein/}
}