Diffusion Probabilistic Modeling of Protein Backbones in 3D for the Motif-Scaffolding Problem

Abstract

Construction of a scaffold structure that supports a desired motif, conferring protein function, shows promise for the design of vaccines and enzymes. However, a general solution to this motif-scaffolding problem remains elusive. Current machine-learning techniques for scaffold design are either limited to unrealistically small scaffolds (up to length 20) or struggle to produce multiple diverse scaffolds. We propose to learn a distribution over diverse and longer protein backbone structures via an E(3)-equivariant graph neural network. We develop SMCDiff to efficiently sample scaffolds from this distribution conditioned on a given motif; our algorithm is the first to theoretically guarantee conditional samples from a diffusion model in the large-compute limit. We evaluate our designed backbones by how well they align with AlphaFold2-predicted structures. We show that our method can (1) sample scaffolds up to 80 residues and (2) produce structurally diverse scaffolds for a fixed motif.
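
To make the term "E(3)-equivariant" concrete, the following is a minimal sketch of the symmetry property only, not the paper's network or SMCDiff. It assumes a toy four-point "backbone" with made-up coordinates; the helper names (`transform`, `centroid`, `pairwise_distances`) are hypothetical. It checks that a simple function of the coordinates (the centroid) moves along with a rigid motion or mirror image of the input, while pairwise distances do not change at all.

```python
import math

def transform(points, theta, shift, mirror=False):
    """Apply an E(3) map: optional mirror (x -> -x), rotation about z, then translation."""
    c, s = math.cos(theta), math.sin(theta)
    out = []
    for x, y, z in points:
        if mirror:
            x = -x  # reflections are part of E(3) but not of SE(3)
        out.append((c * x - s * y + shift[0], s * x + c * y + shift[1], z + shift[2]))
    return out

def centroid(points):
    """Mean position; an example of an E(3)-equivariant function."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(3))

def pairwise_distances(points):
    """All inter-point distances; an example of an E(3)-invariant feature."""
    return [math.dist(a, b) for i, a in enumerate(points) for b in points[i + 1:]]

def close(u, v):
    return all(math.isclose(a, b, abs_tol=1e-9) for a, b in zip(u, v))

# Toy coordinates (assumption: illustrative values, not real protein data).
backbone = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (5.1, 3.6, 0.0), (8.2, 4.1, 2.0)]
theta, shift = 0.7, (1.0, -2.0, 0.5)

results = {}
for mirror in (False, True):
    moved = transform(backbone, theta, shift, mirror)
    # Equivariance: f(g(x)) == g(f(x)) for the centroid.
    equivariant = close(centroid(moved),
                        transform([centroid(backbone)], theta, shift, mirror)[0])
    # Invariance: distances are unchanged by g.
    invariant = close(pairwise_distances(moved), pairwise_distances(backbone))
    results[mirror] = (equivariant, invariant)

print(results)
```

An equivariant network satisfies the same relation as `centroid` here: transforming the input and then applying the model gives the same result as applying the model and then transforming the output.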

Cite

Text

Trippe et al. "Diffusion Probabilistic Modeling of Protein Backbones in 3D for the Motif-Scaffolding Problem." International Conference on Learning Representations, 2023.

Markdown

[Trippe et al. "Diffusion Probabilistic Modeling of Protein Backbones in 3D for the Motif-Scaffolding Problem." International Conference on Learning Representations, 2023.](https://mlanthology.org/iclr/2023/trippe2023iclr-diffusion/)

BibTeX

@inproceedings{trippe2023iclr-diffusion,
  title     = {{Diffusion Probabilistic Modeling of Protein Backbones in 3D for the Motif-Scaffolding Problem}},
  author    = {Trippe, Brian L. and Yim, Jason and Tischer, Doug and Baker, David and Broderick, Tamara and Barzilay, Regina and Jaakkola, Tommi S.},
  booktitle = {International Conference on Learning Representations},
  year      = {2023},
  url       = {https://mlanthology.org/iclr/2023/trippe2023iclr-diffusion/}
}