Elucidating the Design Space of Multimodal Protein Language Models
Abstract
Multimodal protein language models (PLMs) integrate sequence and token-based structural information, serving as a powerful foundation for protein modeling, generation, and design. However, tokenizing 3D structures into discrete tokens causes a substantial loss of fidelity in fine-grained structural details and correlations. In this paper, we systematically elucidate the design space of multimodal PLMs to overcome these limitations. We identify tokenization loss and inaccurate structure token prediction by the PLMs as major bottlenecks. To address these, our proposed design space covers improved generative modeling, structure-aware architectures and representation learning, and data exploration. Our advancements enable finer-grained supervision, demonstrating that token-based multimodal PLMs can achieve robust structural modeling. These design methods dramatically improve structure generation diversity and, notably, the folding ability of our 650M model, reducing the RMSD from 5.52 to 2.36 on the PDB test set, outperforming 3B baselines and performing on par with specialized folding models. Project page and code: https://bytedance.github.io/dplm/dplm-2.1.
Cite
Hsieh et al. "Elucidating the Design Space of Multimodal Protein Language Models." Proceedings of the 42nd International Conference on Machine Learning, 2025. https://mlanthology.org/icml/2025/hsieh2025icml-elucidating/
BibTeX
@inproceedings{hsieh2025icml-elucidating,
title = {{Elucidating the Design Space of Multimodal Protein Language Models}},
author = {Hsieh, Cheng-Yen and Wang, Xinyou and Zhang, Daiheng and Xue, Dongyu and Ye, Fei and Huang, Shujian and Zheng, Zaixiang and Gu, Quanquan},
booktitle = {Proceedings of the 42nd International Conference on Machine Learning},
year = {2025},
pages = {24156--24175},
volume = {267},
url = {https://mlanthology.org/icml/2025/hsieh2025icml-elucidating/}
}