Towards Protein Sequence & Structure Co-Design with Multi-Modal Language Models
Abstract
Proteins perform diverse biological functions, governed by the intricate relationship between their sequence and three-dimensional structure. While protein language models (PLMs) have demonstrated remarkable success in functional annotation and structure prediction, their potential for sequence-structure co-design remains underexplored. This limitation arises from pre-training objectives that favor masked token prediction over generative modeling. In this work, we systematically explore sampling strategies to enhance the generative capabilities of PLMs for co-design. Notably, we introduce a ranked iterative decoding scheme with re-masking, enabling PLMs to generate sequences and structures more effectively. Benchmarking ESM3 across multiple scales, we demonstrate that using PLMs effectively at sampling time for co-design tasks can outperform specialized architectures that lack comparable scaling properties. Our work advances the field of computational protein design by equipping PLMs with robust generative capabilities tailored to sequence-structure interdependence.
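The abstract names the key algorithmic ingredient but does not spell it out, so here is a minimal sketch of one plausible reading of ranked iterative decoding with re-masking: at each step, propose tokens at all masked positions, rank the proposals by model confidence, commit only the top-ranked fraction, and leave the rest masked for later revision. Everything below is illustrative and not taken from the paper: `score_fn`, `ranked_iterative_decode`, the `MASK` sentinel, and the `num_steps`/`remask_frac` defaults are all assumptions standing in for a masked PLM interface such as ESM3's.

```python
import numpy as np

MASK = -1  # hypothetical mask token id (assumption, not the ESM3 vocabulary)

def ranked_iterative_decode(score_fn, length, vocab_size,
                            num_steps=10, remask_frac=0.2, rng=None):
    """Sketch: sample one token track (sequence or structure) iteratively.

    score_fn(tokens) -> (length, vocab_size) array of logits; it stands in
    for a masked protein language model. Names and defaults are illustrative.
    """
    rng = np.random.default_rng() if rng is None else rng
    tokens = np.full(length, MASK, dtype=np.int64)

    for _ in range(num_steps):
        masked = np.where(tokens == MASK)[0]
        if masked.size == 0:
            break
        logits = score_fn(tokens)
        probs = np.exp(logits - logits.max(-1, keepdims=True))
        probs /= probs.sum(-1, keepdims=True)

        # Propose a token at every masked position and record its probability.
        proposal = tokens.copy()
        conf = np.empty(masked.size)
        for j, i in enumerate(masked):
            proposal[i] = rng.choice(vocab_size, p=probs[i])
            conf[j] = probs[i, proposal[i]]

        # Rank proposals by model confidence: commit the top fraction,
        # leaving the rest masked so later iterations can revise them.
        keep = max(1, int(round((1 - remask_frac) * masked.size)))
        committed = masked[np.argsort(-conf)[:keep]]
        tokens[committed] = proposal[committed]

    # Greedily fill any positions still masked after the step budget.
    still_masked = tokens == MASK
    if still_masked.any():
        tokens[still_masked] = score_fn(tokens)[still_masked].argmax(-1)
    return tokens

# Toy usage with a random "model" (for illustration only):
toy = lambda toks: np.random.default_rng(0).normal(size=(len(toks), 20))
print(ranked_iterative_decode(toy, length=12, vocab_size=20))
```

For sequence-structure co-design, one would run this over both token tracks (amino acids and structure tokens), conditioning each step on the current state of the other; the single-track version above is kept deliberately minimal.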
Cite
Text

Lu et al. "Towards Protein Sequence & Structure Co-Design with Multi-Modal Language Models." ICLR 2025 Workshops: GEM, 2025.

Markdown

[Lu et al. "Towards Protein Sequence & Structure Co-Design with Multi-Modal Language Models." ICLR 2025 Workshops: GEM, 2025.](https://mlanthology.org/iclrw/2025/lu2025iclrw-protein/)

BibTeX
@inproceedings{lu2025iclrw-protein,
  title = {{Towards Protein Sequence \& Structure Co-Design with Multi-Modal Language Models}},
  author = {Lu, Stephen Zhewen and Lu, Jiarui and Guo, Hongyu and Tang, Jian},
  booktitle = {ICLR 2025 Workshops: GEM},
  year = {2025},
  url = {https://mlanthology.org/iclrw/2025/lu2025iclrw-protein/}
}