Large-Scale Sparse Inverse Covariance Estimation via Thresholding and Max-Det Matrix Completion

Abstract

The sparse inverse covariance estimation problem is commonly solved using an $\ell_{1}$-regularized Gaussian maximum likelihood estimator known as “graphical lasso”, but its computational cost becomes prohibitive for large data sets. A recently line of results showed{–}under mild assumptions{–}that the graphical lasso estimator can be retrieved by soft-thresholding the sample covariance matrix and solving a maximum determinant matrix completion (MDMC) problem. This paper proves an extension of this result, and describes a Newton-CG algorithm to efficiently solve the MDMC problem. Assuming that the thresholded sample covariance matrix is sparse with a sparse Cholesky factorization, we prove that the algorithm converges to an $\epsilon$-accurate solution in $O(n\log(1/\epsilon))$ time and $O(n)$ memory. The algorithm is highly efficient in practice: we solve the associated MDMC problems with as many as 200,000 variables to 7-9 digits of accuracy in less than an hour on a standard laptop computer running MATLAB.

Cite

Text

Zhang et al. "Large-Scale Sparse Inverse Covariance Estimation via Thresholding and Max-Det Matrix Completion." International Conference on Machine Learning, 2018.

Markdown

[Zhang et al. "Large-Scale Sparse Inverse Covariance Estimation via Thresholding and Max-Det Matrix Completion." International Conference on Machine Learning, 2018.](https://mlanthology.org/icml/2018/zhang2018icml-largescale/)

BibTeX

@inproceedings{zhang2018icml-largescale,
  title     = {{Large-Scale Sparse Inverse Covariance Estimation via Thresholding and Max-Det Matrix Completion}},
  author    = {Zhang, Richard and Fattahi, Salar and Sojoudi, Somayeh},
  booktitle = {International Conference on Machine Learning},
  year      = {2018},
  pages     = {5766-5775},
  volume    = {80},
  url       = {https://mlanthology.org/icml/2018/zhang2018icml-largescale/}
}