On the Nyström Method for Approximating a Gram Matrix for Improved Kernel-Based Learning

Abstract

A problem for many kernel-based methods is that the amount of computation required to find the solution scales as $O(n^3)$, where $n$ is the number of training examples. We develop and analyze an algorithm to compute an easily-interpretable low-rank approximation to an $n \times n$ Gram matrix $G$ such that computations of interest may be performed more rapidly. The approximation is of the form $\tilde{G}_k = C W_k^+ C^T$, where $C$ is a matrix consisting of a small number $c$ of columns of $G$ and $W_k$ is the best rank-$k$ approximation to $W$, the matrix formed by the intersection between those $c$ columns of $G$ and the corresponding $c$ rows of $G$. An important aspect of the algorithm is the probability distribution used to randomly sample the columns; we will use a judiciously-chosen and data-dependent nonuniform probability distribution. Let $\|\cdot\|_2$ and $\|\cdot\|_F$ denote the spectral norm and the Frobenius norm, respectively, of a matrix, and let $G_k$ be the best rank-$k$ approximation to $G$. We prove that by choosing $O(k/\epsilon^4)$ columns, $\|G - C W_k^+ C^T\|_\xi \le \|G - G_k\|_\xi + \epsilon \sum_{i=1}^{n} G_{ii}^2$, both in expectation and with high probability, for both $\xi = 2, F$, and for all $k: 0 \le k \le \mathrm{rank}(W)$. This approximation can be computed using $O(n)$ additional space and time, after making two passes over the data from external storage.
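For concreteness, the following is a minimal NumPy sketch of the procedure the abstract describes: sample $c$ columns of $G$ with the data-dependent probabilities $p_i = G_{ii}^2 / \sum_j G_{jj}^2$ analyzed in the paper, rescale them to form $C$ and $W$, and return $C W_k^+ C^T$. The function name `nystrom_approx` and the rank-truncation tolerance are illustrative choices here, not from the paper.

```python
import numpy as np

def nystrom_approx(G, c, k, seed=None):
    """Rank-k Nystrom approximation C @ pinv(W_k) @ C.T of an
    n x n symmetric PSD Gram matrix G (illustrative sketch)."""
    rng = np.random.default_rng(seed)
    n = G.shape[0]
    # Data-dependent, nonuniform sampling probabilities p_i ~ G_ii^2.
    p = np.diag(G) ** 2
    p /= p.sum()
    idx = rng.choice(n, size=c, replace=True, p=p)
    # Rescale the sampled columns/rows so the estimator is properly
    # normalized: column t is divided by sqrt(c * p[idx[t]]).
    scale = 1.0 / np.sqrt(c * p[idx])
    C = G[:, idx] * scale                              # n x c
    W = G[np.ix_(idx, idx)] * scale * scale[:, None]   # c x c
    # Best rank-k approximation W_k via the eigendecomposition of the
    # symmetric matrix W, followed by its pseudoinverse.
    evals, evecs = np.linalg.eigh(W)       # ascending eigenvalues
    evals_k, evecs_k = evals[-k:], evecs[:, -k:]
    keep = evals_k > 1e-12                 # drop (near-)zero modes
    Wk_pinv = (evecs_k[:, keep] / evals_k[keep]) @ evecs_k[:, keep].T
    return C @ Wk_pinv @ C.T
```

Given an RBF kernel matrix, for example, `nystrom_approx(G, c=200, k=20)` yields an approximation that can stand in for $G$ in downstream kernel computations without the full $O(n^3)$ cost.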

Cite

Text

Drineas and Mahoney. "On the Nyström Method for Approximating a Gram Matrix for Improved Kernel-Based Learning." Journal of Machine Learning Research, 6:2153–2175, 2005.

Markdown

[Drineas and Mahoney. "On the Nyström Method for Approximating a Gram Matrix for Improved Kernel-Based Learning." Journal of Machine Learning Research, 6:2153–2175, 2005.](https://mlanthology.org/jmlr/2005/drineas2005jmlr-nystrom/)

BibTeX

@article{drineas2005jmlr-nystrom,
  title     = {{On the Nystr\"{o}m Method for Approximating a Gram Matrix for Improved Kernel-Based Learning}},
  author    = {Drineas, Petros and Mahoney, Michael W.},
  journal   = {Journal of Machine Learning Research},
  year      = {2005},
  pages     = {2153--2175},
  volume    = {6},
  url       = {https://mlanthology.org/jmlr/2005/drineas2005jmlr-nystrom/}
}