Graph Diffusion Transformers Are In-Context Molecular Designers
Abstract
In-context learning lets large models adapt to new tasks from a few demonstrations, but it has shown limited success in molecular design, where labeled data are scarce and properties span millions of biological assays and material measurements. We introduce demonstration-conditioned diffusion models (DemoDiff), which define task contexts through molecule–score examples rather than text descriptions. These demonstrations guide a denoising Transformer to generate molecules aligned with target properties. For scalable pretraining, we develop a new molecular tokenizer with Node Pair Encoding that represents molecules at the motif level, requiring 5.5$\times$ fewer nodes. We pretrain a 0.7B-parameter model on datasets covering drugs and materials. Across 33 design tasks in six categories, DemoDiff matches or surpasses language models 100–1000$\times$ larger and achieves an average rank of 4.10, compared to 6.56–17.95 for 19 baselines. These results position DemoDiff as a molecular foundation model for in-context molecular design.
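The average-rank figures quoted above aggregate per-task rankings into one number per method. As a minimal sketch of how such a metric is commonly computed (not the authors' evaluation code), the snippet below ranks methods within each task and averages the ranks across tasks. The function name `average_ranks`, the method and task names, the scores, the higher-is-better convention, and the tie handling are all illustrative assumptions.

```python
def average_ranks(scores):
    """Return the mean rank of each method across tasks.

    scores: {task: {method: score}}, where a higher score is assumed better.
    Rank 1 is the best method on a task. Ties are broken by insertion
    order, which is a simplification; a real evaluation may average tied ranks.
    """
    ranks = {}
    for by_method in scores.values():
        # Sort methods from best to worst score on this task.
        ordered = sorted(by_method, key=by_method.get, reverse=True)
        for rank, method in enumerate(ordered, start=1):
            ranks.setdefault(method, []).append(rank)
    return {method: sum(r) / len(r) for method, r in ranks.items()}


# Hypothetical scores for three made-up methods on two made-up tasks.
example_scores = {
    "task_a": {"model_x": 0.9, "model_y": 0.7, "model_z": 0.5},
    "task_b": {"model_x": 0.6, "model_y": 0.8, "model_z": 0.4},
}
example_result = average_ranks(example_scores)
print(example_result)
```

A lower average rank means a method places nearer the top more consistently, which is why a single value can summarize performance over tasks whose raw scores are not directly comparable.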
Cite
Text
Liu et al. "Graph Diffusion Transformers Are In-Context Molecular Designers." International Conference on Learning Representations, 2026.
Markdown
[Liu et al. "Graph Diffusion Transformers Are In-Context Molecular Designers." International Conference on Learning Representations, 2026.](https://mlanthology.org/iclr/2026/liu2026iclr-graph/)
BibTeX
@inproceedings{liu2026iclr-graph,
title = {{Graph Diffusion Transformers Are In-Context Molecular Designers}},
author = {Liu, Gang and Chen, Jie and Zhu, Yihan and Sun, Michael and Luo, Tengfei and Chawla, Nitesh V and Jiang, Meng},
booktitle = {International Conference on Learning Representations},
year = {2026},
url = {https://mlanthology.org/iclr/2026/liu2026iclr-graph/}
}