Black-Box K-to-1-PCA Reductions: Theory and Applications

Abstract

The $k$-principal component analysis ($k$-PCA) problem is a fundamental algorithmic primitive that is widely used in data analysis and dimensionality reduction applications. In statistical settings, the goal of $k$-PCA is to identify a top eigenspace of the covariance matrix of a distribution, to which we have only black-box access via samples. Motivated by these settings, we analyze black-box deflation methods as a framework for designing $k$-PCA algorithms, where we model access to the unknown target matrix via a black-box $1$-PCA oracle which returns an approximate top eigenvector, under two popular notions of approximation. Despite being arguably the most natural reduction-based approach to $k$-PCA algorithm design, such black-box methods, which recursively call a $1$-PCA oracle $k$ times, were previously poorly understood. Our main contribution is significantly sharper bounds on the approximation parameter degradation of deflation methods for $k$-PCA. For a quadratic form notion of approximation we term ePCA (energy PCA), we show deflation methods suffer no parameter loss. For an alternative well-studied approximation notion we term cPCA (correlation PCA), we tightly characterize the parameter regimes where deflation methods are feasible. Moreover, we show that in all feasible regimes, $k$-cPCA deflation algorithms suffer no asymptotic parameter loss for any constant $k$. We apply our framework to obtain state-of-the-art $k$-PCA algorithms robust to dataset contamination, improving prior work in sample complexity by a $\mathsf{poly}(k)$ factor.
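To make the deflation idea concrete, the following is a minimal NumPy sketch of a generic deflation loop: call a $1$-PCA oracle, project the returned direction out of the matrix, and repeat $k$ times. It is an illustration under stated assumptions, not the paper's exact algorithm. The names `one_pca_oracle` and `deflation_k_pca` are hypothetical, the oracle is stood in for by plain power iteration on an explicitly known matrix, and the iteration count is an arbitrary choice.

```python
import numpy as np


def one_pca_oracle(M, iters=500, seed=0):
    """Hypothetical stand-in for a black-box 1-PCA oracle.

    Runs power iteration on the symmetric PSD matrix M and returns an
    approximate unit-norm top eigenvector. Any other 1-PCA routine could
    be swapped in here.
    """
    rng = np.random.default_rng(seed)
    v = rng.standard_normal(M.shape[0])
    v /= np.linalg.norm(v)
    for _ in range(iters):
        w = M @ v
        norm = np.linalg.norm(w)
        if norm == 0.0:  # M annihilates v; keep the current vector
            break
        v = w / norm
    return v


def deflation_k_pca(M, k, oracle=one_pca_oracle):
    """Generic deflation: k oracle calls on successively projected matrices.

    Returns a d x k matrix whose columns are orthonormal approximate
    top-k principal directions of M.
    """
    d = M.shape[0]
    P = np.eye(d)  # projector onto the complement of directions found so far
    columns = []
    for _ in range(k):
        u = oracle(P @ M @ P)   # oracle sees only the deflated matrix
        u = P @ u               # remove any residual overlap with earlier columns
        u /= np.linalg.norm(u)
        columns.append(u)
        P = P - np.outer(u, u)  # deflate: project out the new direction
    return np.column_stack(columns)
```

As a usage example, for a symmetric PSD matrix `M` one could call `U = deflation_k_pca(M, k=2)` and compare the quadratic form `np.trace(U.T @ M @ U)` against the sum of the top two eigenvalues of `M`, which is the kind of energy-style comparison the ePCA notion suggests. In the statistical setting described above, the oracle would instead be built from samples, so `M` would never be formed explicitly.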

Cite

Text

Jambulapati et al. "Black-Box K-to-1-PCA Reductions: Theory and Applications." Conference on Learning Theory, 2024.

Markdown

[Jambulapati et al. "Black-Box K-to-1-PCA Reductions: Theory and Applications." Conference on Learning Theory, 2024.](https://mlanthology.org/colt/2024/jambulapati2024colt-blackbox/)

BibTeX

@inproceedings{jambulapati2024colt-blackbox,
  title     = {{Black-Box K-to-1-PCA Reductions: Theory and Applications}},
  author    = {Jambulapati, Arun and Kumar, Syamantak and Li, Jerry and Pandey, Shourya and Pensia, Ankit and Tian, Kevin},
  booktitle = {Conference on Learning Theory},
  year      = {2024},
  pages     = {2564--2607},
  volume    = {247},
  url       = {https://mlanthology.org/colt/2024/jambulapati2024colt-blackbox/}
}