Learning Stochastic Motifs from Genetic Sequences
Abstract
This paper presents a methodology for learning stochastic motifs from given genetic sequences. A stochastic motif here is a probabilistic mapping from a genetic sequence (which has been drawn from a finite alphabet) to a number of categories (cytochrome c, globin, trypsin, etc.). We propose a new representation of stochastic motifs, stochastic decision predicates (SDPs), and reduce our learning problem to that of learning SDPs. We employ Rissanen's Minimum Description Length (MDL) principle in selecting an optimal hypothesis and present a detailed method for calculating description lengths relative to SDPs. Experimental results show the validity of our learning strategy.
Cite
Text
Yamanishi and Konagaya. "Learning Stochastic Motifs from Genetic Sequences." International Conference on Machine Learning, 1991. doi:10.1016/B978-1-55860-200-7.50096-9
Markdown
[Yamanishi and Konagaya. "Learning Stochastic Motifs from Genetic Sequences." International Conference on Machine Learning, 1991.](https://mlanthology.org/icml/1991/yamanishi1991icml-learning/) doi:10.1016/B978-1-55860-200-7.50096-9
BibTeX
@inproceedings{yamanishi1991icml-learning,
title = {{Learning Stochastic Motifs from Genetic Sequences}},
author = {Yamanishi, Kenji and Konagaya, Akihiko},
booktitle = {International Conference on Machine Learning},
year = {1991},
pages = {467-471},
doi = {10.1016/B978-1-55860-200-7.50096-9},
url = {https://mlanthology.org/icml/1991/yamanishi1991icml-learning/}
}