PolieDRO: A Novel Classification and Regression Framework with Non-Parametric Data-Driven Regularization

Abstract

PolieDRO is a novel analytics framework for classification and regression that harnesses the power and flexibility of data-driven distributionally robust optimization (DRO) to circumvent the need for regularization hyperparameters. Recent literature shows that traditional machine learning methods such as SVM and (square-root) LASSO can be written as Wasserstein-based DRO problems. Inspired by those results, we propose a hyperparameter-free ambiguity set that exploits the polyhedral structure of data-driven convex hulls, generating computationally tractable regression and classification methods for any convex loss function. Numerical results based on 100 real-world databases and an extensive experiment with synthetically generated data show that our methods consistently outperform their traditional counterparts.
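
For orientation, the generic DRO problem that the abstract refers to can be written as below. This is the standard textbook form, not the paper's specific formulation; the symbols $\beta$ (model parameters), $\ell$ (a convex loss), $(x, y)$ (a feature-response pair), and $\mathcal{P}$ (the ambiguity set of candidate distributions) are notation assumed here for illustration. Wasserstein-based DRO takes $\mathcal{P}$ to be a ball whose radius is a hyperparameter; per the abstract, PolieDRO instead builds $\mathcal{P}$ from data-driven convex hulls.

```latex
\min_{\beta} \; \sup_{P \in \mathcal{P}} \; \mathbb{E}_{P}\!\left[ \ell(\beta; x, y) \right]
```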

Cite

Text

Gutierrez et al. "PolieDRO: A Novel Classification and Regression Framework with Non-Parametric Data-Driven Regularization." Machine Learning, 2024. doi:10.1007/S10994-024-06544-9

Markdown

[Gutierrez et al. "PolieDRO: A Novel Classification and Regression Framework with Non-Parametric Data-Driven Regularization." Machine Learning, 2024.](https://mlanthology.org/mlj/2024/gutierrez2024mlj-poliedro/) doi:10.1007/S10994-024-06544-9

BibTeX

@article{gutierrez2024mlj-poliedro,
  title     = {{PolieDRO: A Novel Classification and Regression Framework with Non-Parametric Data-Driven Regularization}},
  author    = {Gutierrez, Tomás and Valladão, Davi Michel and Pagnoncelli, Bernardo K.},
  journal   = {Machine Learning},
  year      = {2024},
  pages     = {5807--5846},
  doi       = {10.1007/S10994-024-06544-9},
  volume    = {113},
  url       = {https://mlanthology.org/mlj/2024/gutierrez2024mlj-poliedro/}
}