Predicting Protein Folds with Structural Repeats Using a Chain Graph Model

Abstract

Protein fold recognition is a key step towards inferring tertiary structures from amino-acid sequences. Complex folds such as those consisting of interacting structural repeats are prevalent in proteins involved in a wide spectrum of biological functions. However, extant approaches often perform inadequately due to their inability to capture long-range interactions between structural units and to handle low sequence similarities across proteins (under 25% identity). In this paper, we propose a chain graph model built on a causally connected series of segmentation conditional random fields (SCRFs) to address these issues. Specifically, the SCRF model captures long-range interactions within recurring structural units, and the Bayesian network backbone decomposes cross-repeat interactions into locally computable modules consisting of repeat-specific SCRFs and a model for sequence motifs. We applied this model to predict β-helices and leucine-rich repeats, and found that it significantly outperforms extant methods in predictive accuracy and/or computational efficiency.

Cite

Text

Liu et al. "Predicting Protein Folds with Structural Repeats Using a Chain Graph Model." International Conference on Machine Learning, 2005. doi:10.1145/1102351.1102416

Markdown

[Liu et al. "Predicting Protein Folds with Structural Repeats Using a Chain Graph Model." International Conference on Machine Learning, 2005.](https://mlanthology.org/icml/2005/liu2005icml-predicting/) doi:10.1145/1102351.1102416

BibTeX

@inproceedings{liu2005icml-predicting,
  title     = {{Predicting Protein Folds with Structural Repeats Using a Chain Graph Model}},
  author    = {Liu, Yan and Xing, Eric P. and Carbonell, Jaime G.},
  booktitle = {International Conference on Machine Learning},
  year      = {2005},
  pages     = {513--520},
  doi       = {10.1145/1102351.1102416},
  url       = {https://mlanthology.org/icml/2005/liu2005icml-predicting/}
}