On the Nystrom Method for Approximating a Gram Matrix for Improved Kernel-Based Learning
Abstract
A problem for many kernel-based methods is that the amount of computation required to find the solution scales as $O(n^3)$, where $n$ is the number of training examples. We develop and analyze an algorithm to compute an easily-interpretable low-rank approximation to an $n \times n$ Gram matrix $G$ such that computations of interest may be performed more rapidly. The approximation is of the form $\tilde{G}_k = C W_k^+ C^T$, where $C$ is a matrix consisting of a small number $c$ of columns of $G$ and $W_k$ is the best rank-$k$ approximation to $W$, the matrix formed by the intersection of those $c$ columns of $G$ with the corresponding $c$ rows of $G$. An important aspect of the algorithm is the probability distribution used to randomly sample the columns; we use a judiciously chosen, data-dependent nonuniform probability distribution. Let $\|\cdot\|_2$ and $\|\cdot\|_F$ denote the spectral norm and the Frobenius norm, respectively, of a matrix, and let $G_k$ be the best rank-$k$ approximation to $G$. We prove that by choosing $O(k/\varepsilon^4)$ columns, $\|G - \tilde{G}_k\|_\xi \le \|G - G_k\|_\xi + \varepsilon \sum_{i=1}^{n} G_{ii}^2$, both in expectation and with high probability, for both $\xi = 2, F$, and for all $k$ with $0 \le k \le \operatorname{rank}(W)$.
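To make the construction in the abstract concrete, here is a minimal NumPy sketch (not the authors' code) of the sampling and reconstruction steps: columns are drawn with probability $p_i \propto G_{ii}^2$, rescaled by $1/\sqrt{c\,p_i}$ in the standard importance-weighting fashion used in this line of work, and $W_k^+$ is formed from the top-$k$ eigenpairs of $W$. The function name, seeding convention, and the eigenvalue tolerance are illustrative assumptions.

```python
import numpy as np

def nystrom_rank_k(G, c, k, seed=0):
    """Sketch of the rank-k Nystrom approximation G_tilde_k = C @ W_k^+ @ C.T.

    G : (n, n) symmetric positive semidefinite Gram matrix.
    c : number of sampled columns; k : target rank (k <= c).
    """
    rng = np.random.default_rng(seed)
    n = G.shape[0]

    # Nonuniform, data-dependent sampling distribution: p_i proportional to G_ii^2.
    d = np.diag(G) ** 2
    p = d / d.sum()

    # Sample c column indices with replacement; rescale each sampled column
    # by 1 / sqrt(c * p_i) so that E[C @ C.T] = G @ G.T.
    idx = rng.choice(n, size=c, replace=True, p=p)
    scale = 1.0 / np.sqrt(c * p[idx])

    C = G[:, idx] * scale                                # n x c rescaled columns
    W = (G[np.ix_(idx, idx)] * scale) * scale[:, None]   # c x c intersection block

    # Best rank-k approximation to W via its eigendecomposition (W is PSD),
    # followed by the Moore-Penrose pseudoinverse of that rank-k matrix.
    vals, vecs = np.linalg.eigh(W)
    top = np.argsort(vals)[::-1][:k]
    vals_k, vecs_k = vals[top], vecs[:, top]
    inv_k = np.where(vals_k > 1e-12, 1.0 / vals_k, 0.0)  # tolerance is an assumption
    Wk_pinv = vecs_k @ np.diag(inv_k) @ vecs_k.T

    return C @ Wk_pinv @ C.T                             # n x n, rank at most k
```

In use, one would form a Gram matrix from a kernel (e.g., an RBF kernel on the training points), call `nystrom_rank_k(G, c, k)` with $c \gg k$, and compare $\|G - \tilde{G}_k\|_F$ against the SVD-optimal $\|G - G_k\|_F$; the theorem above bounds the gap by $\varepsilon \sum_i G_{ii}^2$.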
Cite
Text
Drineas and Mahoney. "On the Nystrom Method for Approximating a Gram Matrix for Improved Kernel-Based Learning." Journal of Machine Learning Research, 2005.
Markdown
[Drineas and Mahoney. "On the Nystrom Method for Approximating a Gram Matrix for Improved Kernel-Based Learning." Journal of Machine Learning Research, 2005.](https://mlanthology.org/jmlr/2005/drineas2005jmlr-nystrom/)
BibTeX
@article{drineas2005jmlr-nystrom,
title = {{On the Nystrom Method for Approximating a Gram Matrix for Improved Kernel-Based Learning}},
author = {Drineas, Petros and Mahoney, Michael W.},
journal = {Journal of Machine Learning Research},
year = {2005},
pages = {2153-2175},
volume = {6},
url = {https://mlanthology.org/jmlr/2005/drineas2005jmlr-nystrom/}
}