Scalable and Cost-Efficient De Novo Template-Based Molecular Generation

Abstract

Recent advances in reaction-based molecular generation hold great promise for drug design. Composing a molecule from a predefined set of reaction templates and building blocks keeps the generative modeling in line with what can be synthesized in a real-world wet lab. In this paper, we tackle three important challenges of template-based GFlowNets: 1) reducing synthesis cost, 2) navigating a large set of building blocks, and 3) exploiting a small set of building blocks. We propose Cost Guidance for the backward policy, which uses an auxiliary machine-learning model to approximate the synthesis cost. Our approach limits the cost of proposed molecules while drastically improving their diversity and quality in large-scale settings. Moreover, we design a Dynamic Library mechanism that allows the generation of full synthesis trees, boosting the results in small-scale settings. The resulting generative model establishes state-of-the-art results in template-based molecular generation on a benchmark concerning synthesis cost and the diversity of high-reward molecules.

Cite

Text

Gaiński et al. "Scalable and Cost-Efficient De Novo Template-Based Molecular Generation." ICLR 2025 Workshops: GEM, 2025.

Markdown

[Gaiński et al. "Scalable and Cost-Efficient De Novo Template-Based Molecular Generation." ICLR 2025 Workshops: GEM, 2025.](https://mlanthology.org/iclrw/2025/gainski2025iclrw-scalable/)

BibTeX

@inproceedings{gainski2025iclrw-scalable,
  title     = {{Scalable and Cost-Efficient De Novo Template-Based Molecular Generation}},
  author    = {Gaiński, Piotr and Boussif, Oussama and Shevchuk, Dmytro and Rekesh, Andrei and Parviz, Ali and Tyers, Mike and Batey, Robert A. and Koziarski, Michał},
  booktitle = {ICLR 2025 Workshops: GEM},
  year      = {2025},
  url       = {https://mlanthology.org/iclrw/2025/gainski2025iclrw-scalable/}
}