PPDiff: Diffusing in Hybrid Sequence-Structure Space for Protein-Protein Complex Design
Abstract
Designing protein-binding proteins with high affinity is critical in biomedical research and biotechnology. Despite recent advances in targeting specific proteins, creating high-affinity binders for arbitrary protein targets on demand, without extensive rounds of wet-lab testing, remains a significant challenge. Here, we introduce PPDiff, a diffusion model that jointly designs the sequence and structure of binders for arbitrary protein targets in a non-autoregressive manner. PPDiff builds on our Sequence Structure Interleaving Network with Causal attention layers (SSINC), which integrates interleaved self-attention layers to capture global amino acid correlations, $k$-nearest neighbor ($k$NN) equivariant graph convolutional layers to model local interactions in three-dimensional (3D) space, and causal attention layers to simplify the intricate interdependencies within the protein sequence. To assess PPDiff, we curate PPBench, a general protein-protein complex dataset comprising 706,360 complexes from the Protein Data Bank (PDB). The model is pretrained on PPBench and finetuned on two real-world applications: target-protein mini-binder complex design and antigen-antibody complex design. PPDiff consistently surpasses baseline methods, achieving success rates of 50.00%, 23.16%, and 16.89% for the pretraining task and the two downstream applications, respectively.
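As an illustration of the $k$NN neighborhoods that the abstract mentions for modeling local 3D interactions, the following is a minimal sketch and not the authors' implementation. It assumes each residue is represented by a single 3D coordinate; the function name `knn_indices` and the toy coordinates are hypothetical. It also checks that the neighbor sets do not change under a rigid rotation and translation, which is the property that makes distance-based graphs a natural fit for equivariant layers.

```python
import numpy as np


def knn_indices(coords: np.ndarray, k: int) -> np.ndarray:
    """Return an (N, k) array holding the k nearest neighbours of each point.

    Hypothetical helper for illustration; a point is never its own neighbour.
    """
    n = coords.shape[0]
    if not 0 < k < n:
        raise ValueError("k must satisfy 0 < k < number of points")
    # Pairwise squared Euclidean distances, shape (N, N).
    diff = coords[:, None, :] - coords[None, :, :]
    dist2 = np.einsum("ijk,ijk->ij", diff, diff)
    # Exclude self-matches by pushing the diagonal to infinity.
    np.fill_diagonal(dist2, np.inf)
    # Sort each row by distance and keep the k closest indices.
    return np.argsort(dist2, axis=1)[:, :k]


# Toy input (an assumption): five pseudo-residues placed along the x-axis.
coords = np.array([
    [0.0, 0.0, 0.0],
    [1.0, 0.0, 0.0],
    [2.5, 0.0, 0.0],
    [4.5, 0.0, 0.0],
    [7.0, 0.0, 0.0],
])
neighbors = knn_indices(coords, k=2)

# Apply a rigid motion: rotation about the z-axis plus a translation.
theta = 0.7
rotation = np.array([
    [np.cos(theta), -np.sin(theta), 0.0],
    [np.sin(theta), np.cos(theta), 0.0],
    [0.0, 0.0, 1.0],
])
moved = coords @ rotation.T + np.array([3.0, -1.0, 2.0])
neighbors_moved = knn_indices(moved, k=2)

# Pairwise distances are unchanged by rigid motions, so the graph is too.
same_graph = bool(np.array_equal(neighbors, neighbors_moved))
```

The dense distance matrix costs memory quadratic in the number of residues, which is acceptable for a sketch; a spatial index would be a more typical choice for large complexes.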
Cite
Text
Song et al. "PPDiff: Diffusing in Hybrid Sequence-Structure Space for Protein-Protein Complex Design." Proceedings of the 42nd International Conference on Machine Learning, 2025.
Markdown
[Song et al. "PPDiff: Diffusing in Hybrid Sequence-Structure Space for Protein-Protein Complex Design." Proceedings of the 42nd International Conference on Machine Learning, 2025.](https://mlanthology.org/icml/2025/song2025icml-ppdiff/)
BibTeX
@inproceedings{song2025icml-ppdiff,
title = {{PPDiff: Diffusing in Hybrid Sequence-Structure Space for Protein-Protein Complex Design}},
author = {Song, Zhenqiao and Li, Tianxiao and Li, Lei and Min, Martin Renqiang},
booktitle = {Proceedings of the 42nd International Conference on Machine Learning},
year = {2025},
pages = {56319--56336},
volume = {267},
url = {https://mlanthology.org/icml/2025/song2025icml-ppdiff/}
}