Statistical Optimization of Non-Negative Matrix Factorization

Abstract

Non-Negative Matrix Factorization (NMF) is a dimensionality reduction method that has been shown to be very useful for a variety of tasks in machine learning and data mining. One of the fastest algorithms for NMF is the Block Principal Pivoting method (BPP) of (Kim & Park, 2008b), which follows a block coordinate descent approach. The optimization in each iteration involves solving a large number of expensive least squares problems. Taking the view that the design matrix was generated by a stochastic process, and using the asymptotic normality of the least squares estimator, we propose a method for improving the performance of BPP. Our method starts with a small subset of the columns and rows of the original matrix and uses frequentist hypothesis tests to adaptively increase the size of the problem. This achieves two objectives: 1) during the initial phase of the algorithm we solve far fewer, much smaller sized least squares problems and 2) all hypothesis tests failing while using all the data represents a principled, automatic stopping criterion. Our experiments on three real world datasets show that our algorithm significantly improves the performance of the original BPP algorithm.

Cite

Text

Balan et al. "Statistical Optimization of Non-Negative Matrix Factorization." Proceedings of the Fourteenth International Conference on Artificial Intelligence and Statistics, 2011.

Markdown

[Balan et al. "Statistical Optimization of Non-Negative Matrix Factorization." Proceedings of the Fourteenth International Conference on Artificial Intelligence and Statistics, 2011.](https://mlanthology.org/aistats/2011/balan2011aistats-statistical/)

BibTeX

@inproceedings{balan2011aistats-statistical,
  title     = {{Statistical Optimization of Non-Negative Matrix Factorization}},
  author    = {Balan, Anoop Korattikara and Boyles, Levi and Welling, Max and Kim, Jingu and Park, Haesun},
  booktitle = {Proceedings of the Fourteenth International Conference on Artificial Intelligence and Statistics},
  year      = {2011},
  pages     = {128-136},
  volume    = {15},
  url       = {https://mlanthology.org/aistats/2011/balan2011aistats-statistical/}
}