Sampling Protein Language Models for Functional Protein Design
Abstract
Protein language models have emerged as powerful tools for learning rich representations of proteins, enhancing performance across various downstream tasks such as structure prediction, mutation effects prediction, and homology detection. Their ability to learn complex distributions over protein sequences also shows significant potential for designing novel and functional proteins, with broad applications in therapeutics, new materials, and sustainability. Given the vastness of the protein sequence space, efficient exploration methods are critical to the success of protein engineering efforts. However, the methodologies for effectively sampling from these models to achieve core protein design objectives remain underexplored and have predominantly relied on techniques initially developed for Natural Language Processing tasks. In this work, we first develop a comprehensive *in silico* protein design evaluation framework to systematically compare different sampling methods. After a thorough review of existing sampling strategies for language models, we introduce several approaches specifically tailored for protein design. We then evaluate these strategies using our *in silico* benchmark, investigating the effects of key hyperparameters and providing practical guidance on the relative strengths of each method depending on design objectives.
Cite
Text
Darmawan et al. "Sampling Protein Language Models for Functional Protein Design." ICLR 2025 Workshops: GEM, 2025.
Markdown
[Darmawan et al. "Sampling Protein Language Models for Functional Protein Design." ICLR 2025 Workshops: GEM, 2025.](https://mlanthology.org/iclrw/2025/darmawan2025iclrw-sampling/)
BibTeX
@inproceedings{darmawan2025iclrw-sampling,
title = {{Sampling Protein Language Models for Functional Protein Design}},
author = {Darmawan, Jeremie Theddy and Gal, Yarin and Notin, Pascal},
booktitle = {ICLR 2025 Workshops: GEM},
year = {2025},
url = {https://mlanthology.org/iclrw/2025/darmawan2025iclrw-sampling/}
}