Feature Adaptation for Sparse Linear Regression

Abstract

Sparse linear regression is a central problem in high-dimensional statistics. We study the correlated random design setting, where the covariates are drawn from a multivariate Gaussian $N(0,\Sigma)$, and we seek an estimator with small excess risk. If the true signal is $t$-sparse, it is information-theoretically possible to achieve strong recovery guarantees with only $O(t\log n)$ samples. However, computationally efficient algorithms have sample complexity linear in (some variant of) the *condition number* of $\Sigma$. Classical algorithms such as the Lasso can require significantly more samples than necessary even if there is only a single sparse approximate dependency among the covariates. We provide a polynomial-time algorithm that, given $\Sigma$, automatically adapts the Lasso to tolerate a small number of approximate dependencies. In particular, we achieve near-optimal sample complexity when the sparsity is constant and $\Sigma$ has few "outlier" eigenvalues. Our algorithm fits into a broader framework of *feature adaptation* for sparse linear regression with ill-conditioned covariates. With this framework, we additionally provide the first polynomial-factor improvement over brute-force search for constant sparsity $t$ and arbitrary covariance $\Sigma$.
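As a point of reference for the setting described above, the sketch below (not the paper's feature-adaptation algorithm) sets up a correlated random design with a single approximate sparse dependency among the covariates, draws a $t$-sparse signal, and measures the excess risk of a plain Lasso fit. All dimensions, noise levels, and regularization choices are illustrative assumptions.

```python
# A minimal sketch (not the paper's algorithm) of the correlated random design
# setting: covariates X ~ N(0, Sigma) with one approximate sparse dependency,
# a t-sparse signal, and a plain Lasso fit. Parameter values are illustrative.
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(0)
n_features, t, n_samples = 200, 3, 400

# Build Sigma with a single approximate dependency: feature 0 is nearly the
# sum of features 1 and 2, which makes Sigma ill-conditioned.
A = np.eye(n_features)
A[0, 1] = A[0, 2] = 1.0   # x_0 ≈ x_1 + x_2 plus a small independent part
A[0, 0] = 0.05
Sigma = A @ A.T

# Draw the design X ~ N(0, Sigma) and a t-sparse signal w.
X = rng.multivariate_normal(np.zeros(n_features), Sigma, size=n_samples)
w = np.zeros(n_features)
w[rng.choice(n_features, size=t, replace=False)] = 1.0
y = X @ w + 0.1 * rng.standard_normal(n_samples)

# Fit the Lasso; with an ill-conditioned Sigma its sample complexity can grow
# with (a variant of) the condition number, unlike the information-theoretic
# O(t log n) rate.
w_hat = Lasso(alpha=0.1).fit(X, y).coef_

# Excess prediction risk E[(x^T(w_hat - w))^2] = (w_hat - w)^T Sigma (w_hat - w).
excess_risk = (w_hat - w) @ Sigma @ (w_hat - w)
print(f"excess risk of plain Lasso: {excess_risk:.4f}")
```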

Cite

Text

Kelner et al. "Feature Adaptation for Sparse Linear Regression." Neural Information Processing Systems, 2023.

Markdown

[Kelner et al. "Feature Adaptation for Sparse Linear Regression." Neural Information Processing Systems, 2023.](https://mlanthology.org/neurips/2023/kelner2023neurips-feature/)

BibTeX

@inproceedings{kelner2023neurips-feature,
  title     = {{Feature Adaptation for Sparse Linear Regression}},
  author    = {Kelner, Jonathan and Koehler, Frederic and Meka, Raghu and Rohatgi, Dhruv},
  booktitle = {Neural Information Processing Systems},
  year      = {2023},
  url       = {https://mlanthology.org/neurips/2023/kelner2023neurips-feature/}
}