High-Dimensional Regression with Noisy and Missing Data: Provable Guarantees with Non-Convexity
Abstract
Although the standard formulations of prediction problems involve fully-observed and noiseless data drawn in an i.i.d. manner, many applications involve noisy and/or missing data, possibly involving dependencies. We study these issues in the context of high-dimensional sparse linear regression, and propose novel estimators for the cases of noisy, missing, and/or dependent data. Many standard approaches to noisy or missing data, such as those using the EM algorithm, lead to optimization problems that are inherently non-convex, and it is difficult to establish theoretical guarantees on practical algorithms. While our approach also involves optimizing non-convex programs, we are able to both analyze the statistical error associated with any global optimum, and prove that a simple projected gradient descent algorithm will converge in polynomial time to a small neighborhood of the set of global minimizers. On the statistical side, we provide non-asymptotic bounds that hold with high probability for the cases of noisy, missing, and/or dependent data. On the computational side, we prove that under the same types of conditions required for statistical consistency, the projected gradient descent algorithm will converge at geometric rates to a near-global minimizer. We illustrate these theoretical predictions with simulations, showing agreement with the predicted scalings.
Cite
Text
Loh and Wainwright. "High-Dimensional Regression with Noisy and Missing Data: Provable Guarantees with Non-Convexity." Neural Information Processing Systems, 2011.Markdown
[Loh and Wainwright. "High-Dimensional Regression with Noisy and Missing Data: Provable Guarantees with Non-Convexity." Neural Information Processing Systems, 2011.](https://mlanthology.org/neurips/2011/loh2011neurips-highdimensional/)BibTeX
@inproceedings{loh2011neurips-highdimensional,
title = {{High-Dimensional Regression with Noisy and Missing Data: Provable Guarantees with Non-Convexity}},
author = {Loh, Po-ling and Wainwright, Martin J.},
booktitle = {Neural Information Processing Systems},
year = {2011},
pages = {2726-2734},
url = {https://mlanthology.org/neurips/2011/loh2011neurips-highdimensional/}
}