Scalable and Cost-Efficient De Novo Template-Based Molecular Generation
Abstract
Template-based molecular generation offers a promising avenue for drug design by ensuring generated compounds are synthetically accessible through predefined reaction templates and building blocks. In this work, we tackle three core challenges in template-based GFlowNets: (1) minimizing synthesis cost, (2) scaling to large building block libraries, and (3) effectively utilizing small fragment sets. We propose **Recursive Cost Guidance**, a backward policy framework that employs auxiliary machine learning models to approximate synthesis cost and viability. This guidance steers generation toward low-cost synthesis pathways, significantly enhancing cost-efficiency, molecular diversity, and quality, especially when paired with an **Exploitation Penalty** that balances the trade-off between exploration and exploitation. To enhance performance in smaller building block libraries, we develop a **Dynamic Library** mechanism that reuses intermediate high-reward states to construct full synthesis trees. Our approach establishes state-of-the-art results in template-based molecular generation.
Cite
Text
Gaiński et al. "Scalable and Cost-Efficient De Novo Template-Based Molecular Generation." Advances in Neural Information Processing Systems, 2025.
Markdown
[Gaiński et al. "Scalable and Cost-Efficient De Novo Template-Based Molecular Generation." Advances in Neural Information Processing Systems, 2025.](https://mlanthology.org/neurips/2025/gainski2025neurips-scalable/)
BibTeX
@inproceedings{gainski2025neurips-scalable,
title = {{Scalable and Cost-Efficient De Novo Template-Based Molecular Generation}},
author = {Gaiński, Piotr and Boussif, Oussama and Rekesh, Andrei and Shevchuk, Dmytro and Parviz, Ali and Tyers, Mike and Batey, Robert A. and Koziarski, Michał},
booktitle = {Advances in Neural Information Processing Systems},
year = {2025},
url = {https://mlanthology.org/neurips/2025/gainski2025neurips-scalable/}
}