Kernels as Features: On Kernels, Margins, and Low-Dimensional Mappings

Abstract

Kernel functions are typically viewed as providing an implicit mapping of points into a high-dimensional space, with the ability to gain much of the power of that space without incurring a high cost if the result is linearly-separable by a large margin γ. However, the Johnson-Lindenstrauss lemma suggests that in the presence of a large margin, a kernel function can also be viewed as a mapping to a low -dimensional space, one of dimension only $\tilde{O}(1/\gamma^2)$ . In this paper, we explore the question of whether one can efficiently produce such low-dimensional mappings, using only black-box access to a kernel function. That is, given just a program that computes K ( x , y ) on inputs x , y of our choosing, can we efficiently construct an explicit (small) set of features that effectively capture the power of the implicit high-dimensional space? We answer this question in the affirmative if our method is also allowed black-box access to the underlying data distribution (i.e., unlabeled examples). We also give a lower bound, showing that if we do not have access to the distribution, then this is not possible for an arbitrary black-box kernel function; we leave as an open problem, however, whether this can be done for standard kernel functions such as the polynomial kernel. Our positive result can be viewed as saying that designing a good kernel function is much like designing a good feature space. Given a kernel, by running it in a black-box manner on random unlabeled examples, we can efficiently generate an explicit set of $\tilde{O}(1/\gamma^2)$ features, such that if the data was linearly separable with margin γ under the kernel, then it is approximately separable in this new feature space.

Cite

Text

Balcan et al. "Kernels as Features: On Kernels, Margins, and Low-Dimensional Mappings." Machine Learning, 2006. doi:10.1007/S10994-006-7550-1

Markdown

[Balcan et al. "Kernels as Features: On Kernels, Margins, and Low-Dimensional Mappings." Machine Learning, 2006.](https://mlanthology.org/mlj/2006/balcan2006mlj-kernels/) doi:10.1007/S10994-006-7550-1

BibTeX

@article{balcan2006mlj-kernels,
  title     = {{Kernels as Features: On Kernels, Margins, and Low-Dimensional Mappings}},
  author    = {Balcan, Maria-Florina and Blum, Avrim and Vempala, Santosh S.},
  journal   = {Machine Learning},
  year      = {2006},
  pages     = {79-94},
  doi       = {10.1007/S10994-006-7550-1},
  volume    = {65},
  url       = {https://mlanthology.org/mlj/2006/balcan2006mlj-kernels/}
}