Enhancing Protein Mutation Effect Prediction Through a Retrieval-Augmented Framework

Abstract

Predicting the effects of protein mutations is crucial for analyzing protein function and understanding genetic diseases. However, existing models struggle to extract mutation-related local structure motifs from protein databases, which limits their predictive accuracy and robustness. To address this problem, we design a novel retrieval-augmented framework that incorporates information from similar structures among known protein structures. We build a vector database of local structure motif embeddings produced by a pre-trained protein structure encoder, which allows efficient retrieval of similar local structure motifs during mutation effect prediction. Our results show that this method achieves state-of-the-art (SOTA) performance across multiple protein mutation prediction datasets and offers a scalable solution for studying mutation effects.
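The retrieval step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the encoder, the embedding dimension, the similarity measure, or the index type. The sketch assumes cosine similarity with a brute-force search, uses random vectors as placeholders for motif embeddings, and the names `build_index` and `retrieve` are hypothetical.

```python
import numpy as np


def build_index(embeddings: np.ndarray) -> np.ndarray:
    """L2-normalize each row so a dot product equals cosine similarity."""
    norms = np.linalg.norm(embeddings, axis=1, keepdims=True)
    return embeddings / np.clip(norms, 1e-12, None)


def retrieve(index: np.ndarray, query: np.ndarray, k: int = 3):
    """Return the ids and cosine similarities of the k nearest stored motifs."""
    q = query / max(float(np.linalg.norm(query)), 1e-12)
    scores = index @ q                # cosine similarity to every stored motif
    top = np.argsort(-scores)[:k]     # indices of the k highest scores
    return top, scores[top]


# Placeholder data (assumption): 1000 motif embeddings of dimension 64.
# In the framework described above these would come from a pre-trained
# protein structure encoder; random vectors stand in for them here.
rng = np.random.default_rng(0)
motif_embeddings = rng.normal(size=(1000, 64))
index = build_index(motif_embeddings)

# Query: a slightly perturbed copy of stored motif 42, standing in for the
# local structure motif around a mutation site.
query = motif_embeddings[42] + 0.01 * rng.normal(size=64)
ids, sims = retrieve(index, query, k=3)
```

A brute-force search like this scales linearly with the database size. A large motif database would more plausibly rely on an approximate nearest-neighbor index, but that choice is an assumption here rather than something stated in the abstract.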

Cite

Text

Guo et al. "Enhancing Protein Mutation Effect Prediction Through a Retrieval-Augmented Framework." Neural Information Processing Systems, 2024. doi:10.52202/079017-1556

Markdown

[Guo et al. "Enhancing Protein Mutation Effect Prediction Through a Retrieval-Augmented Framework." Neural Information Processing Systems, 2024.](https://mlanthology.org/neurips/2024/guo2024neurips-enhancing/) doi:10.52202/079017-1556

BibTeX

@inproceedings{guo2024neurips-enhancing,
  title     = {{Enhancing Protein Mutation Effect Prediction Through a Retrieval-Augmented Framework}},
  author    = {Guo, Ruihan and Wang, Rui and Wu, Ruidong and Ren, Zhizhou and Li, Jiahan and Luo, Shitong and Wu, Zuofan and Liu, Qiang and Peng, Jian and Ma, Jianzhu},
  booktitle = {Neural Information Processing Systems},
  year      = {2024},
  doi       = {10.52202/079017-1556},
  url       = {https://mlanthology.org/neurips/2024/guo2024neurips-enhancing/}
}