The Backbone Method for Ultra-High Dimensional Sparse Machine Learning

MLJ 2022 pp. 2161-2212

doi:10.1007/S10994-021-06123-2 /mlj/2022/bertsimas2022mlj-backbone/

Abstract

We present the backbone method, a general framework that enables sparse and interpretable supervised machine learning methods to scale to ultra-high dimensional problems. We solve sparse regression problems with 107\documentclass[12pt]minimal \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}-69pt \begin{document}$10^7$\end{document} features in minutes and 108\documentclass[12pt]minimal \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}-69pt \begin{document}$10^8$\end{document} features in hours, as well as decision tree problems with 105\documentclass[12pt]minimal \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}-69pt \begin{document}$10^5$\end{document} features in minutes. The proposed method operates in two phases: we first determine the backbone set, consisting of potentially relevant features, by solving a number of tractable subproblems; then, we solve a reduced problem, considering only the backbone features. For the sparse regression problem, our theoretical analysis shows that, under certain assumptions and with high probability, the backbone set consists of the truly relevant features. Numerical experiments on both synthetic and real-world datasets demonstrate that our method outperforms or competes with state-of-the-art methods in ultra-high dimensional problems, and competes with optimal solutions in problems where exact methods scale, both in terms of recovering the truly relevant features and in its out-of-sample predictive performance.

PDF MLJ Semantic Scholar

Cite

Text

Bertsimas and Digalakis. "The Backbone Method for Ultra-High Dimensional Sparse Machine Learning." Machine Learning, 2022. doi:10.1007/S10994-021-06123-2

Markdown

[Bertsimas and Digalakis. "The Backbone Method for Ultra-High Dimensional Sparse Machine Learning." Machine Learning, 2022.](https://mlanthology.org/mlj/2022/bertsimas2022mlj-backbone/) doi:10.1007/S10994-021-06123-2

BibTeX

@article{bertsimas2022mlj-backbone,
  title     = {{The Backbone Method for Ultra-High Dimensional Sparse Machine Learning}},
  author    = {Bertsimas, Dimitris and Digalakis, Vassilis},
  journal   = {Machine Learning},
  year      = {2022},
  pages     = {2161-2212},
  doi       = {10.1007/S10994-021-06123-2},
  volume    = {111},
  url       = {https://mlanthology.org/mlj/2022/bertsimas2022mlj-backbone/}
}