Orthogonal Random Features
Abstract
We present an intriguing discovery related to Random Fourier Features: replacing multiplication by a random Gaussian matrix with multiplication by a properly scaled random orthogonal matrix significantly decreases kernel approximation error. We call this technique Orthogonal Random Features (ORF), and provide theoretical and empirical justification for its effectiveness. Motivated by this discovery, we further propose Structured Orthogonal Random Features (SORF), which uses a class of structured discrete orthogonal matrices to speed up the computation. The method reduces the time cost from $\mathcal{O}(d^2)$ to $\mathcal{O}(d \log d)$, where $d$ is the data dimensionality, with almost no compromise in kernel approximation quality compared to ORF. Experiments on several datasets verify the effectiveness of ORF and SORF over existing methods. We also discuss using the same type of discrete orthogonal structure for a broader range of kernels and applications.
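The core ORF construction described in the abstract can be sketched in a few lines of NumPy: draw a Gaussian matrix, orthogonalize it via QR, and rescale each row by an independent chi-distributed norm so the rows match the length distribution of Gaussian rows. This is a minimal sketch under those assumptions (square case, number of projections equal to the dimensionality $d$); the function name `orf_features` and its signature are ours, not from the paper.

```python
import numpy as np

def orf_features(X, sigma=1.0, seed=None):
    """Sketch of Orthogonal Random Features for the Gaussian kernel
    k(x, y) = exp(-||x - y||^2 / (2 sigma^2)), square case (D = d)."""
    rng = np.random.default_rng(seed)
    d = X.shape[1]
    # Random orthogonal matrix from the QR decomposition of a Gaussian matrix.
    G = rng.standard_normal((d, d))
    Q, _ = np.linalg.qr(G)
    # Rescale rows by chi-distributed norms so they mimic Gaussian row lengths.
    s = np.sqrt(rng.chisquare(d, size=d))
    W = (s[:, None] * Q) / sigma
    proj = X @ W.T
    # Cosine/sine feature map; z(x)^T z(y) approximates the Gaussian kernel.
    return np.sqrt(1.0 / d) * np.hstack([np.cos(proj), np.sin(proj)])
```

With this feature map, `z(x) @ z(x)` is exactly 1 (matching `k(x, x) = 1`), and inner products between feature vectors approximate the Gaussian kernel; averaging several independent orthogonal blocks reduces the variance further.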
Cite
Text
Yu et al. "Orthogonal Random Features." Neural Information Processing Systems, 2016.
Markdown
[Yu et al. "Orthogonal Random Features." Neural Information Processing Systems, 2016.](https://mlanthology.org/neurips/2016/yu2016neurips-orthogonal/)
BibTeX
@inproceedings{yu2016neurips-orthogonal,
title = {{Orthogonal Random Features}},
author = {Yu, Felix Xinnan X and Suresh, Ananda Theertha and Choromanski, Krzysztof M and Holtmann-Rice, Daniel N and Kumar, Sanjiv},
booktitle = {Neural Information Processing Systems},
year = {2016},
pages = {1975-1983},
url = {https://mlanthology.org/neurips/2016/yu2016neurips-orthogonal/}
}