Full Regularization Path for Sparse Principal Component Analysis

Abstract

Given a sample covariance matrix, we examine the problem of maximizing the variance explained by a particular linear combination of the input variables while constraining the number of nonzero coefficients in this combination. This is known as sparse principal component analysis and has a wide array of applications in machine learning and engineering. We formulate a new semidefinite relaxation to this problem and derive a greedy algorithm that computes a full set of good solutions for all numbers of nonzero coefficients, with complexity O(n^3), where n is the number of variables. We then use the same relaxation to derive sufficient conditions for global optimality of a solution, which can be tested in O(n^3). We show on toy examples and biological data that our algorithm does provide globally optimal solutions in many cases.
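The greedy idea described above can be sketched as forward selection on the support: at each cardinality, add the variable that most increases the largest eigenvalue of the corresponding covariance submatrix. The sketch below is a naive illustration of that idea only; it recomputes each eigenvalue from scratch (so it is slower than the paper's O(n^3) total, which relies on incremental eigenvalue updates), and the function name is ours, not from the paper.

```python
import numpy as np

def greedy_sparse_pca_path(Sigma):
    """Naive greedy forward selection for sparse PCA.

    For each cardinality k = 1..n, grow the current support by the
    variable that maximizes the largest eigenvalue of the principal
    submatrix of Sigma. Returns a list of (support, variance) pairs,
    one per cardinality -- the full regularization path.
    """
    n = Sigma.shape[0]
    support, path = [], []
    remaining = set(range(n))
    while remaining:
        best_i, best_val = None, -np.inf
        for i in remaining:
            idx = support + [i]
            sub = Sigma[np.ix_(idx, idx)]
            val = np.linalg.eigvalsh(sub)[-1]  # largest eigenvalue
            if val > best_val:
                best_i, best_val = i, val
        support.append(best_i)
        remaining.remove(best_i)
        path.append((list(support), best_val))
    return path
```

By eigenvalue interlacing, the variance explained along the path is nondecreasing in the cardinality, and at full cardinality it equals the leading eigenvalue of Sigma (ordinary PCA).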

Cite

Text

d'Aspremont et al. "Full Regularization Path for Sparse Principal Component Analysis." International Conference on Machine Learning, 2007. doi:10.1145/1273496.1273519

Markdown

[d'Aspremont et al. "Full Regularization Path for Sparse Principal Component Analysis." International Conference on Machine Learning, 2007.](https://mlanthology.org/icml/2007/daposaspremont2007icml-full/) doi:10.1145/1273496.1273519

BibTeX

@inproceedings{daposaspremont2007icml-full,
  title     = {{Full Regularization Path for Sparse Principal Component Analysis}},
  author    = {d'Aspremont, Alexandre and Bach, Francis R. and El Ghaoui, Laurent},
  booktitle = {International Conference on Machine Learning},
  year      = {2007},
  pages     = {177--184},
  doi       = {10.1145/1273496.1273519},
  url       = {https://mlanthology.org/icml/2007/daposaspremont2007icml-full/}
}