Controllable Protein Sequence Generation with LLM Preference Optimization

Abstract

Designing proteins with specific attributes offers an important solution to biomedical challenges. Pre-trained protein large language models (LLMs) have shown promising results on protein sequence generation. However, when generation is controlled for specific attributes, sequences produced by existing methods still exhibit poor functionality and structural stability. In this paper, we propose a novel controllable protein design method called CtrlProt. We fine-tune a protein LLM with a new multi-listwise preference optimization strategy to improve generation quality and support multi-attribute controllable generation. Experiments demonstrate that CtrlProt meets functionality and structural stability requirements effectively, achieving state-of-the-art performance in both single-attribute and multi-attribute protein sequence generation.

Cite

Text

Liu et al. "Controllable Protein Sequence Generation with LLM Preference Optimization." AAAI Conference on Artificial Intelligence, 2025. doi:10.1609/AAAI.V39I1.32030

Markdown

[Liu et al. "Controllable Protein Sequence Generation with LLM Preference Optimization." AAAI Conference on Artificial Intelligence, 2025.](https://mlanthology.org/aaai/2025/liu2025aaai-controllable/) doi:10.1609/AAAI.V39I1.32030

BibTeX

@inproceedings{liu2025aaai-controllable,
  title     = {{Controllable Protein Sequence Generation with LLM Preference Optimization}},
  author    = {Liu, Xiangyu and Liu, Yi and Chen, Silei and Hu, Wei},
  booktitle = {AAAI Conference on Artificial Intelligence},
  year      = {2025},
  pages     = {505--513},
  doi       = {10.1609/AAAI.V39I1.32030},
  url       = {https://mlanthology.org/aaai/2025/liu2025aaai-controllable/}
}