Hidden Markov Models in Molecular Biology: New Algorithms and Applications
Abstract
Hidden Markov Models (HMMs) can be applied to several important problems in molecular biology. We introduce a new convergent learning algorithm for HMMs that, unlike the classical Baum-Welch algorithm, is smooth and can be applied on-line or in batch mode, with or without the usual Viterbi most likely path approximation. Left-right HMMs with insertion and deletion states are then trained to represent several protein families, including immunoglobulins and kinases. In all cases, the derived models capture all the important statistical properties of the families and can be used efficiently in a number of important tasks such as multiple alignment, motif detection, and classification.