Context-Guided Diffusion for Out-of-Distribution Molecular and Protein Design

Abstract

Generative models have the potential to accelerate key steps in the discovery of novel molecular therapeutics and materials. Diffusion models have recently emerged as a powerful approach, excelling at unconditional sample generation and, with data-driven guidance, conditional generation within their training domain. Reliably sampling from high-value regions beyond the training data, however, remains an open challenge—with current methods predominantly focusing on modifying the diffusion process itself. In this paper, we develop context-guided diffusion (CGD), a simple plug-and-play method that leverages unlabeled data and smoothness constraints to improve the out-of-distribution generalization of guided diffusion models. We demonstrate that this approach leads to substantial performance gains across various settings, including continuous, discrete, and graph-structured diffusion processes with applications across drug discovery, materials science, and protein design.

Cite

Text

Klarner et al. "Context-Guided Diffusion for Out-of-Distribution Molecular and Protein Design." International Conference on Machine Learning, 2024.

Markdown

[Klarner et al. "Context-Guided Diffusion for Out-of-Distribution Molecular and Protein Design." International Conference on Machine Learning, 2024.](https://mlanthology.org/icml/2024/klarner2024icml-contextguided/)

BibTeX

@inproceedings{klarner2024icml-contextguided,
  title     = {{Context-Guided Diffusion for Out-of-Distribution Molecular and Protein Design}},
  author    = {Klarner, Leo and Rudner, Tim G. J. and Morris, Garrett M. and Deane, Charlotte and Teh, Yee Whye},
  booktitle = {International Conference on Machine Learning},
  year      = {2024},
  pages     = {24770--24807},
  volume    = {235},
  url       = {https://mlanthology.org/icml/2024/klarner2024icml-contextguided/}
}