High-Dimensional Regression with Noisy and Missing Data: Provable Guarantees with Non-Convexity

Abstract

Although the standard formulations of prediction problems involve fully-observed and noiseless data drawn in an i.i.d. manner, many applications involve noisy and/or missing data, possibly involving dependencies. We study these issues in the context of high-dimensional sparse linear regression, and propose novel estimators for the cases of noisy, missing, and/or dependent data. Many standard approaches to noisy or missing data, such as those using the EM algorithm, lead to optimization problems that are inherently non-convex, and it is difficult to establish theoretical guarantees on practical algorithms. While our approach also involves optimizing non-convex programs, we are able to both analyze the statistical error associated with any global optimum, and prove that a simple projected gradient descent algorithm will converge in polynomial time to a small neighborhood of the set of global minimizers. On the statistical side, we provide non-asymptotic bounds that hold with high probability for the cases of noisy, missing, and/or dependent data. On the computational side, we prove that under the same types of conditions required for statistical consistency, the projected gradient descent algorithm will converge at geometric rates to a near-global minimizer. We illustrate these theoretical predictions with simulations, showing agreement with the predicted scalings.

Cite

Text

Loh and Wainwright. "High-Dimensional Regression with Noisy and Missing Data: Provable Guarantees with Non-Convexity." Neural Information Processing Systems, 2011.

Markdown

[Loh and Wainwright. "High-Dimensional Regression with Noisy and Missing Data: Provable Guarantees with Non-Convexity." Neural Information Processing Systems, 2011.](https://mlanthology.org/neurips/2011/loh2011neurips-highdimensional/)

BibTeX

@inproceedings{loh2011neurips-highdimensional,
  title     = {{High-Dimensional Regression with Noisy and Missing Data: Provable Guarantees with Non-Convexity}},
  author    = {Loh, Po-ling and Wainwright, Martin J.},
  booktitle = {Neural Information Processing Systems},
  year      = {2011},
  pages     = {2726-2734},
  url       = {https://mlanthology.org/neurips/2011/loh2011neurips-highdimensional/}
}