Fine-Tuning the ESM2 Protein Language Model to Understand the Functional Impact of Missense Variants
Abstract
Elucidating the functional effect of missense variants is crucial yet challenging. To understand the impact of such variants, we fine-tuned the ESM2 protein language model to classify 20 protein features at amino acid resolution. We used the resulting models to 1) identify protein features that are enriched in either pathogenic or benign missense variants, and 2) compare the characteristics of proteins carrying the reference or the alternate allele to understand how missense variants affect protein function. We show that our model can be used to reclassify some variants of unknown significance. We also demonstrate how our models can be used to understand the potential effect of variants on protein features.
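The reference-versus-alternate comparison described in the abstract can be illustrated with a minimal sketch. It assumes a per-residue predictor that returns, for each feature, one probability per amino acid; the names `apply_missense`, `feature_shift`, and `toy_predict`, and the "disulfide" feature, are hypothetical and are not taken from the paper. The toy predictor only stands in for a fine-tuned model so that the example runs without any model weights.

```python
from typing import Callable, Dict, List

# Assumed interface: sequence -> {feature name: per-residue probabilities}.
Predictor = Callable[[str], Dict[str, List[float]]]


def apply_missense(seq: str, pos: int, ref: str, alt: str) -> str:
    """Return seq with the residue at 1-based position pos changed from ref to alt."""
    if not 1 <= pos <= len(seq):
        raise ValueError(f"position {pos} is outside the sequence")
    if seq[pos - 1] != ref:
        raise ValueError(f"expected {ref} at position {pos}, found {seq[pos - 1]}")
    return seq[:pos - 1] + alt + seq[pos:]


def feature_shift(predict: Predictor, seq: str, pos: int, ref: str, alt: str) -> Dict[str, float]:
    """Change in each feature's probability at the variant site (alternate minus reference)."""
    ref_pred = predict(seq)
    alt_pred = predict(apply_missense(seq, pos, ref, alt))
    return {name: alt_pred[name][pos - 1] - ref_pred[name][pos - 1] for name in ref_pred}


def toy_predict(seq: str) -> Dict[str, List[float]]:
    """Stand-in for a fine-tuned model: marks every cysteine as a made-up 'disulfide' feature."""
    return {"disulfide": [1.0 if aa == "C" else 0.0 for aa in seq]}


if __name__ == "__main__":
    # Hypothetical variant: cysteine to serine at position 3 of a toy sequence.
    print(feature_shift(toy_predict, "MKCA", 3, "C", "S"))
```

With a real model, `toy_predict` would be replaced by a function that runs the fine-tuned classifier on the sequence; a large negative shift for a feature would suggest that the variant disrupts it at that site.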
Cite
Text
Saadat and Fellay. "Fine-Tuning the ESM2 Protein Language Model to Understand the Functional Impact of Missense Variants." ICML 2024 Workshops: AccMLBio, 2024.
Markdown
[Saadat and Fellay. "Fine-Tuning the ESM2 Protein Language Model to Understand the Functional Impact of Missense Variants." ICML 2024 Workshops: AccMLBio, 2024.](https://mlanthology.org/icmlw/2024/saadat2024icmlw-finetuning/)
BibTeX
@inproceedings{saadat2024icmlw-finetuning,
title = {{Fine-Tuning the ESM2 Protein Language Model to Understand the Functional Impact of Missense Variants}},
author = {Saadat, Ali and Fellay, Jacques},
booktitle = {ICML 2024 Workshops: AccMLBio},
year = {2024},
url = {https://mlanthology.org/icmlw/2024/saadat2024icmlw-finetuning/}
}