Variable Selection Is Hard
Abstract
Variable selection for sparse linear regression is the problem of finding, given an $m \times p$ matrix $B$ and a target vector $y$, a sparse vector $x$ such that $Bx$ approximately equals $y$. Assuming a standard complexity hypothesis, we show that no polynomial-time algorithm can find a $k'$-sparse $x$ with $\|Bx - y\|^2 \le h(m, p)$, where $k' = k \cdot 2^{\log^{1-\delta} p}$ and $h(m, p) = p^{C_1} m^{1 - C_2}$, where $\delta > 0$, $C_1 > 0$, $C_2 > 0$ are arbitrary. This is true even under the promise that there is an unknown $k$-sparse vector $x^*$ satisfying $Bx^* = y$. We prove a similar result for a statistical version of the problem in which the data are corrupted by noise. To the authors' knowledge, these are the first hardness results for sparse regression that apply when the algorithm simultaneously has $k' > k$ and $h(m, p) > 0$.
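To make the object of the hardness result concrete, here is a minimal Python sketch (our illustration, not the paper's construction) of the exact variable-selection problem the abstract defines: a brute-force search over all $\binom{p}{k}$ supports, solving an ordinary least-squares problem on each. The function name `best_k_sparse` and the toy dimensions are hypothetical.

```python
# A minimal sketch (not from the paper) of the variable-selection problem the
# abstract defines: given B (m x p) and y, find the k-sparse x minimizing
# ||Bx - y||^2. The exhaustive search below tries every size-k support and
# solves least squares on each; its C(p, k) iterations illustrate the naive
# exponential-time baseline that the paper's hardness result says cannot be
# matched in polynomial time, even with the relaxed sparsity k' and error
# budget h(m, p).
from itertools import combinations

import numpy as np


def best_k_sparse(B: np.ndarray, y: np.ndarray, k: int) -> np.ndarray:
    """Exact k-sparse least squares by exhaustive search over supports."""
    m, p = B.shape
    best_x, best_err = np.zeros(p), np.inf
    for support in combinations(range(p), k):  # C(p, k) candidate supports
        cols = list(support)
        coef, *_ = np.linalg.lstsq(B[:, cols], y, rcond=None)
        err = np.linalg.norm(B[:, cols] @ coef - y) ** 2
        if err < best_err:
            best_x = np.zeros(p)
            best_x[cols] = coef
            best_err = err
    return best_x


# Tiny demo of the promise version: plant a 2-sparse x* with Bx* = y exactly,
# then recover it by exhaustive search.
rng = np.random.default_rng(0)
B = rng.standard_normal((20, 8))
x_star = np.zeros(8)
x_star[[1, 5]] = [3.0, -2.0]
y = B @ x_star
x_hat = best_k_sparse(B, y, k=2)
assert np.allclose(x_hat, x_star, atol=1e-8)
```

The search costs roughly $\binom{p}{k}$ least-squares solves, which is superpolynomial once $k$ grows with $p$; the theorem quoted above says that, under a standard complexity hypothesis, no polynomial-time algorithm achieves even the much weaker guarantee of $k'$-sparsity and squared error $h(m, p)$.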
Cite
Text
Foster et al. "Variable Selection Is Hard." Annual Conference on Computational Learning Theory, 2015.
Markdown
[Foster et al. "Variable Selection Is Hard." Annual Conference on Computational Learning Theory, 2015.](https://mlanthology.org/colt/2015/foster2015colt-variable/)
BibTeX
@inproceedings{foster2015colt-variable,
title = {{Variable Selection Is Hard}},
author = {Foster, Dean P. and Karloff, Howard J. and Thaler, Justin},
booktitle = {Annual Conference on Computational Learning Theory},
year = {2015},
pages = {696-709},
url = {https://mlanthology.org/colt/2015/foster2015colt-variable/}
}