How Good Is a Kernel When Used as a Similarity Measure?
Abstract
Recently, Balcan and Blum [1] suggested a theory of learning based on general similarity functions, instead of positive semi-definite kernels. We study the gap between the learning guarantees based on kernel-based learning, and those that can be obtained by using the kernel as a similarity function, which was left open by Balcan and Blum. We provide a significantly improved bound on how good a kernel function is when used as a similarity function, and also extend the result to the more practically relevant hinge loss rather than the zero-one error rate. Furthermore, we show that this bound is tight, and hence establish that there is in fact a real gap between the traditional kernel-based notion of margin and the newer similarity-based notion.
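To make the idea of "using a kernel as a similarity function" concrete, the sketch below uses an RBF kernel in a Balcan–Blum-style prediction rule: classify a point by the sign of its average label-weighted similarity to the training sample. This is an illustrative simplification (the dataset, `gamma`, and function names are invented for the example), not the construction analyzed in the paper.

```python
import numpy as np

def rbf_kernel(x, z, gamma=1.0):
    # Gaussian RBF kernel: a PSD kernel whose values in [0, 1]
    # can also be read directly as a similarity score.
    return np.exp(-gamma * np.sum((x - z) ** 2))

def similarity_classifier(X_train, y_train, x, gamma=1.0):
    # Balcan-Blum-style rule: predict the sign of the average
    # label-weighted similarity of x to the training sample.
    sims = np.array([rbf_kernel(x, z, gamma) for z in X_train])
    return np.sign(np.mean(sims * y_train))

# Toy data: two well-separated clusters with labels +1 / -1.
X = np.array([[0.0, 0.0], [0.2, 0.1], [3.0, 3.0], [3.1, 2.9]])
y = np.array([1, 1, -1, -1])

print(similarity_classifier(X, y, np.array([0.1, 0.0])))  # near the +1 cluster
print(similarity_classifier(X, y, np.array([3.0, 3.1])))  # near the -1 cluster
```

The paper's point is about how much margin (and hence sample complexity) can be lost when a kernel with a good kernel-margin guarantee is used only through such similarity-based rules, rather than through the implicit feature space.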
Cite
Text
Srebro. "How Good Is a Kernel When Used as a Similarity Measure?." Annual Conference on Computational Learning Theory, 2007. doi:10.1007/978-3-540-72927-3_24
Markdown
[Srebro. "How Good Is a Kernel When Used as a Similarity Measure?." Annual Conference on Computational Learning Theory, 2007.](https://mlanthology.org/colt/2007/srebro2007colt-good/) doi:10.1007/978-3-540-72927-3_24
BibTeX
@inproceedings{srebro2007colt-good,
title = {{How Good Is a Kernel When Used as a Similarity Measure?}},
author = {Srebro, Nathan},
booktitle = {Annual Conference on Computational Learning Theory},
year = {2007},
pages = {323-335},
doi = {10.1007/978-3-540-72927-3_24},
url = {https://mlanthology.org/colt/2007/srebro2007colt-good/}
}