Sparse Eigen Methods by D.C. Programming
Abstract
Eigenvalue problems are rampant in machine learning and statistics and appear in the context of classification, dimensionality reduction, etc. In this paper, we consider a cardinality-constrained variational formulation of the generalized eigenvalue problem with sparse principal component analysis (PCA) as a special case. Using an ℓ1-norm approximation to the cardinality constraint, previous methods have proposed both convex and non-convex solutions to the sparse PCA problem. In contrast, we propose a tighter approximation that is related to the negative log-likelihood of a Student's t-distribution. The problem is then framed as a d.c. (difference of convex functions) program and is solved as a sequence of locally convex programs. We show that the proposed method not only explains more variance with sparse loadings on the principal directions but also has better scalability compared to other methods. We demonstrate these results on a collection of datasets of varying dimensionality, two of which are high-dimensional gene datasets where the goal is to find a few relevant genes that explain as much variance as possible.
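The d.c. strategy the abstract describes — replacing the cardinality constraint by a concave log-type penalty and solving a sequence of convex subproblems — can be illustrated with a minimal sketch. This is not the authors' exact algorithm: the function name, the `rho`/`eps` parameters, and the power-iteration-style update with soft-thresholding are assumptions made for illustration. The key idea shown is that linearizing the concave penalty sum(log(eps + |w_i|)) at the current iterate yields a weighted ℓ1 term, handled here by a soft-thresholding step.

```python
import numpy as np

def sparse_pca_dc(A, rho=0.5, eps=1e-3, iters=100):
    """Illustrative d.c.-style sparse PCA sketch (not the paper's algorithm).

    At each outer step, the concave penalty sum(log(eps + |w_i|)) is
    linearized at the current iterate w, giving per-coordinate weights
    tau_i = 1/(eps + |w_i|); the resulting weighted-l1 subproblem is
    approximated by one soft-thresholded power-method update.
    """
    n = A.shape[0]
    w = np.ones(n) / np.sqrt(n)                    # initial unit vector
    for _ in range(iters):
        tau = 1.0 / (eps + np.abs(w))              # linearization weights
        g = A @ w                                  # power-method direction
        # soft-threshold with coordinate-wise weighted l1 penalty
        w_new = np.sign(g) * np.maximum(np.abs(g) - rho * tau, 0.0)
        norm = np.linalg.norm(w_new)
        if norm == 0.0:                            # penalty killed all loadings
            break
        w_new /= norm                              # project back to unit sphere
        if np.linalg.norm(w_new - w) < 1e-8:       # converged
            w = w_new
            break
        w = w_new
    return w
```

On a covariance matrix with one dominant coordinate, e.g. `np.diag([5.0, 1.0, 1.0, 1.0, 1.0])`, the iteration drives all but the leading loading to exactly zero, illustrating how the log penalty promotes sparse principal directions more aggressively than a plain ℓ1 term (small loadings receive large weights `tau_i` and are thresholded away).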
Cite
Text
Sriperumbudur et al. "Sparse Eigen Methods by D.C. Programming." International Conference on Machine Learning, 2007. doi:10.1145/1273496.1273601

Markdown
[Sriperumbudur et al. "Sparse Eigen Methods by D.C. Programming." International Conference on Machine Learning, 2007.](https://mlanthology.org/icml/2007/sriperumbudur2007icml-sparse/) doi:10.1145/1273496.1273601

BibTeX
@inproceedings{sriperumbudur2007icml-sparse,
title = {{Sparse Eigen Methods by D.C. Programming}},
author = {Sriperumbudur, Bharath K. and Torres, David A. and Lanckriet, Gert R. G.},
booktitle = {International Conference on Machine Learning},
year = {2007},
pages = {831--838},
doi = {10.1145/1273496.1273601},
url = {https://mlanthology.org/icml/2007/sriperumbudur2007icml-sparse/}
}