Mismatch String Kernels for SVM Protein Classification

Eleazar Eskin, Jason Weston, William S. Noble, Christina S. Leslie

NeurIPS 2002 pp. 1441-1448

Abstract

We introduce a class of string kernels, called mismatch kernels, for use with support vector machines (SVMs) in a discriminative approach to the protein classification problem. These kernels measure sequence similarity based on shared occurrences of k-length subsequences, counted with up to m mismatches, and do not rely on any generative model for the positive training sequences. We compute the kernels efficiently using a mismatch tree data structure and report experiments on a benchmark SCOP dataset, where we show that the mismatch kernel used with an SVM classifier performs as well as the Fisher kernel, the most successful method for remote homology detection, while achieving considerable computational savings.
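
To make the definition concrete, here is a minimal Python sketch of a (k, m)-mismatch kernel computed by brute force. It is not the mismatch tree computation the abstract refers to: it enumerates every string within Hamming distance m of each k-length subsequence, which is only practical for small k and m. The function names (`neighbors`, `mismatch_features`, `mismatch_kernel`) and the four-letter toy alphabet are assumptions made for illustration; protein sequences would use the 20-letter amino acid alphabet.

```python
from collections import Counter
from itertools import combinations, product

# Toy alphabet (assumption for illustration); proteins would use 20 amino acids.
ALPHABET = "ACGT"


def neighbors(kmer, m, alphabet=ALPHABET):
    """Yield each string within Hamming distance <= m of kmer exactly once."""
    k = len(kmer)
    for d in range(m + 1):
        # Pick the d positions that differ from the original k-mer.
        for positions in combinations(range(k), d):
            # At each chosen position, only letters unlike the original qualify.
            choices = [[c for c in alphabet if c != kmer[p]] for p in positions]
            for subs in product(*choices):
                chars = list(kmer)
                for p, c in zip(positions, subs):
                    chars[p] = c
                yield "".join(chars)


def mismatch_features(seq, k, m, alphabet=ALPHABET):
    """Sparse feature map: each k-mer in seq adds 1 to every neighbor's count."""
    feats = Counter()
    for i in range(len(seq) - k + 1):
        for beta in neighbors(seq[i:i + k], m, alphabet):
            feats[beta] += 1
    return feats


def mismatch_kernel(x, y, k, m, alphabet=ALPHABET):
    """Kernel value: inner product of the two sparse feature maps."""
    fx = mismatch_features(x, k, m, alphabet)
    fy = mismatch_features(y, k, m, alphabet)
    return sum(count * fy[beta] for beta, count in fx.items())
```

For example, `mismatch_kernel("AC", "AG", k=2, m=0)` is 0 because the two 2-mers differ, while allowing one mismatch (`m=1`) gives a nonzero value because their mismatch neighborhoods overlap. A matrix of such values could then be supplied to an SVM implementation that accepts precomputed kernels.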

Cite

Text

Eskin et al. "Mismatch String Kernels for SVM Protein Classification." Neural Information Processing Systems, 2002.

Markdown

[Eskin et al. "Mismatch String Kernels for SVM Protein Classification." Neural Information Processing Systems, 2002.](https://mlanthology.org/neurips/2002/eskin2002neurips-mismatch/)

BibTeX

@inproceedings{eskin2002neurips-mismatch,
  title     = {{Mismatch String Kernels for SVM Protein Classification}},
  author    = {Eskin, Eleazar and Weston, Jason and Noble, William S. and Leslie, Christina S.},
  booktitle = {Neural Information Processing Systems},
  year      = {2002},
  pages     = {1441-1448},
  url       = {https://mlanthology.org/neurips/2002/eskin2002neurips-mismatch/}
}