Improving Protein-Peptide Interface Predictions in the Low Data Regime
Abstract
We propose a novel approach for predicting protein-peptide interactions using a bi-modal transformer architecture that learns an interfacial joint distribution of residue contacts. Current data sets of crystallized protein-peptide complexes are limited, which makes it difficult to predict interactions between proteins and peptides accurately. To address this issue, we augment the existing data from PepBDB with pseudo protein-peptide complexes derived from the PDB. The augmented data set serves to transfer physics-based, context-dependent intra-residue (within a domain) interactions to the inter-residue (between domains) setting. We show that the distribution of interfacial residue-residue interactions overlaps with that of inter residue-residue interactions enough to increase the predictive power of our bi-modal transformer architecture. In addition, this data augmentation allows us to leverage the vast amount of protein-only data available in the PDB to train neural networks, in contrast to template-based modeling, which acts as a prior.
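The abstract does not say how the pseudo protein-peptide complexes are built from PDB entries. As a purely illustrative sketch, one plausible construction is to excise a contiguous window from a single chain, treat it as the "peptide", treat the rest of the chain as the "receptor", and label residue pairs within a distance cutoff as contacts. The function name `pseudo_complex_contacts`, the 8 Å cutoff, the `flank` parameter, and the use of one coordinate per residue are all assumptions made for this example, not details taken from the paper.

```python
import math


def pseudo_complex_contacts(coords, start, length, cutoff=8.0, flank=2):
    """Split one chain into a pseudo-peptide and pseudo-receptor and list contacts.

    coords : list of (x, y, z) tuples, one per residue of a single chain
             (for example C-alpha positions); hypothetical input format.
    start, length : the excised window that plays the role of the peptide.
    cutoff : contact distance in angstroms (assumed value, not from the paper).
    flank : number of residues on each side of the window that are skipped,
            so that trivially close chain neighbours are not counted.
    Returns a list of (receptor_index, peptide_index) pairs.
    """
    n = len(coords)
    if length <= 0 or start < 0 or start + length > n:
        raise ValueError("window must lie inside the chain")

    peptide = range(start, start + length)
    # Everything outside the window and its flanking residues is the receptor.
    receptor = [
        i for i in range(n)
        if i < start - flank or i >= start + length + flank
    ]

    contacts = []
    for p in peptide:
        for r in receptor:
            if math.dist(coords[p], coords[r]) <= cutoff:
                contacts.append((r, p))
    return contacts
```

Sliding the window along every chain of a protein-only structure would yield many such pseudo complexes per entry, which is one way the large protein-only portion of the PDB could supplement the smaller set of true complexes in PepBDB.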
Cite
Text
Diamond and Lill. "Improving Protein-Peptide Interface Predictions in the Low Data Regime." ICLR 2023 Workshops: MLDD, 2023.
Markdown
[Diamond and Lill. "Improving Protein-Peptide Interface Predictions in the Low Data Regime." ICLR 2023 Workshops: MLDD, 2023.](https://mlanthology.org/iclrw/2023/diamond2023iclrw-improving/)
BibTeX
@inproceedings{diamond2023iclrw-improving,
title = {{Improving Protein-Peptide Interface Predictions in the Low Data Regime}},
author = {Diamond, Justin and Lill, Markus Alexander},
booktitle = {ICLR 2023 Workshops: MLDD},
year = {2023},
url = {https://mlanthology.org/iclrw/2023/diamond2023iclrw-improving/}
}