GenMol: A Drug Discovery Generalist with Discrete Diffusion

Abstract

Drug discovery is a complex process that involves multiple stages and tasks. However, existing molecular generative models can only tackle some of these tasks. We present the Generalist Molecular generative model (GenMol), a versatile framework that uses a single discrete diffusion model to handle diverse drug discovery scenarios. GenMol generates Sequential Attachment-based Fragment Embedding (SAFE) sequences through non-autoregressive bidirectional parallel decoding, which lets it use molecular context that does not depend on a specific token ordering and improves sampling efficiency. GenMol uses fragments as the basic building blocks of molecules and introduces fragment remasking, a strategy that optimizes molecules by regenerating masked fragments, enabling effective exploration of chemical space. We further propose molecular context guidance (MCG), a guidance method tailored to the masked discrete diffusion of GenMol. GenMol significantly outperforms the previous GPT-based model in de novo generation and fragment-constrained generation, and achieves state-of-the-art performance in goal-directed hit generation and lead optimization. These results demonstrate that GenMol can tackle a wide range of drug discovery tasks, providing a unified and versatile approach for molecular design.
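To make the idea of fragment remasking more concrete, the following is a minimal sketch of the masking step only, not the authors' implementation. It assumes a SAFE sequence can be treated as dot-separated fragments, and it replaces one fragment with a fixed number of mask tokens. The token name `[MASK]`, the function `remask_fragment`, the fixed mask length, and the hand-written example string are all illustrative assumptions. The regeneration step, in which a trained diffusion model fills in the masked positions, is not shown.

```python
import random

MASK = "[MASK]"  # hypothetical mask token name, chosen for illustration


def remask_fragment(safe, rng, n_mask=4):
    """Replace one randomly chosen dot-separated fragment with mask tokens.

    Returns the masked sequence and the index of the masked fragment.
    The fixed mask length n_mask is an assumption of this sketch.
    """
    fragments = safe.split(".")
    idx = rng.randrange(len(fragments))
    fragments[idx] = MASK * n_mask
    return ".".join(fragments), idx


# Hand-written SAFE-style string (three fragments joined by attachment
# digits 2 and 3); it has not been validated with a chemistry toolkit.
example = "c1ccccc1C2.N2C(=O)C3.O3C"
masked, masked_idx = remask_fragment(example, random.Random(0))
print(masked_idx, masked)
```

In a full pipeline, the masked positions would then be decoded by the model, and the resulting molecule would be scored against the optimization objective.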

Cite

Text

Lee et al. "GenMol: A Drug Discovery Generalist with Discrete Diffusion." Proceedings of the 42nd International Conference on Machine Learning, 2025.

Markdown

[Lee et al. "GenMol: A Drug Discovery Generalist with Discrete Diffusion." Proceedings of the 42nd International Conference on Machine Learning, 2025.](https://mlanthology.org/icml/2025/lee2025icml-genmol/)

BibTeX

@inproceedings{lee2025icml-genmol,
  title     = {{GenMol: A Drug Discovery Generalist with Discrete Diffusion}},
  author    = {Lee, Seul and Kreis, Karsten and Veccham, Srimukh Prasad and Liu, Meng and Reidenbach, Danny and Peng, Yuxing and Paliwal, Saee Gopal and Nie, Weili and Vahdat, Arash},
  booktitle = {Proceedings of the 42nd International Conference on Machine Learning},
  year      = {2025},
  pages     = {33205--33226},
  volume    = {267},
  url       = {https://mlanthology.org/icml/2025/lee2025icml-genmol/}
}