Efficient Computation of Gapped Substring Kernels on Large Alphabets
Abstract
We present a sparse dynamic programming algorithm that, given two strings s and t , a gap penalty λ, and an integer p, computes the value of the gap-weighted length-p subsequences kernel. The algorithm works in time O(p |M| log |t|), where M = (i,j) | si = tj is the set of matches of characters in the two sequences. The algorithm is easily adapted to handle bounded length subsequences and different gap-penalty schemes, including penalizing by the total length of gaps and the number of gaps as well as incorporating character-specific match/gap penalties.
Cite
Text
Rousu and Shawe-Taylor. "Efficient Computation of Gapped Substring Kernels on Large Alphabets." Journal of Machine Learning Research, 2005.Markdown
[Rousu and Shawe-Taylor. "Efficient Computation of Gapped Substring Kernels on Large Alphabets." Journal of Machine Learning Research, 2005.](https://mlanthology.org/jmlr/2005/rousu2005jmlr-efficient/)BibTeX
@article{rousu2005jmlr-efficient,
title = {{Efficient Computation of Gapped Substring Kernels on Large Alphabets}},
author = {Rousu, Juho and Shawe-Taylor, John},
journal = {Journal of Machine Learning Research},
year = {2005},
pages = {1323-1344},
volume = {6},
url = {https://mlanthology.org/jmlr/2005/rousu2005jmlr-efficient/}
}