RetroGFN: Diverse and Feasible Retrosynthesis Using GFlowNets

Abstract

Single-step retrosynthesis, a crucial task in molecular discovery, aims to predict a set of reactions that produce a target molecule. Although a target molecule can often be synthesized through several different reactions, it is unclear how to verify the feasibility of a proposed reaction, because the available datasets cover only a tiny fraction of the possible solutions. Consequently, existing models are not encouraged to explore the space of possible reactions sufficiently. To address these issues, we first propose a Feasibility Thresholded Count (FTC) metric that estimates reaction feasibility with a machine-learning model. Second, we develop a novel retrosynthesis model, RetroGFN, which can explore beyond the limited dataset and return a diverse set of feasible reactions. We show that RetroGFN outperforms existing methods on the FTC metric by a large margin while maintaining competitive results on the widely used top-k accuracy metric.
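
To make the two metrics named in the abstract concrete, here is a minimal Python sketch. It is one plausible reading of the names only: the abstract does not give the exact FTC definition, so the threshold value, the cut-off at k proposals, the absence of normalization, and the function names `feasibility_thresholded_count` and `top_k_hit` are all assumptions for illustration, and the paper's definition may differ. The feasibility model is stood in for by a toy lookup table with made-up scores.

```python
from typing import Callable, Sequence


def feasibility_thresholded_count(
    reactions: Sequence[str],
    feasibility_score: Callable[[str], float],
    k: int = 10,
    threshold: float = 0.5,
) -> int:
    """Count how many of the first k proposed reactions score at or above threshold.

    Assumption: `feasibility_score` wraps some learned feasibility model and
    returns a value in [0, 1]; the threshold of 0.5 is illustrative only.
    """
    return sum(1 for rxn in reactions[:k] if feasibility_score(rxn) >= threshold)


def top_k_hit(predictions: Sequence[str], ground_truth: str, k: int) -> bool:
    """Return True if the recorded reaction appears among the first k predictions."""
    return ground_truth in predictions[:k]


# Hypothetical ranked proposals for one target molecule (placeholder identifiers).
proposals = ["rxn_a", "rxn_b", "rxn_c", "rxn_d"]

# Toy stand-in for a feasibility model: made-up scores, not real data.
toy_scores = {"rxn_a": 0.92, "rxn_b": 0.31, "rxn_c": 0.77, "rxn_d": 0.55}

ftc_at_3 = feasibility_thresholded_count(proposals, toy_scores.__getitem__, k=3)
hit_at_1 = top_k_hit(proposals, ground_truth="rxn_c", k=1)
hit_at_3 = top_k_hit(proposals, ground_truth="rxn_c", k=3)
```

Under this reading, top-k accuracy rewards only the single reaction recorded in the dataset, while the thresholded count credits every proposal the feasibility model accepts, which is why the two metrics can rank models differently.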

Cite

Text

Gaiński et al. "RetroGFN: Diverse and Feasible Retrosynthesis Using GFlowNets." ICLR 2024 Workshops: GEM, 2024.

Markdown

[Gaiński et al. "RetroGFN: Diverse and Feasible Retrosynthesis Using GFlowNets." ICLR 2024 Workshops: GEM, 2024.](https://mlanthology.org/iclrw/2024/gainski2024iclrw-retrogfn/)

BibTeX

@inproceedings{gainski2024iclrw-retrogfn,
  title     = {{RetroGFN: Diverse and Feasible Retrosynthesis Using GFlowNets}},
  author    = {Gaiński, Piotr and Koziarski, Michał and Maziarz, Krzysztof and Segler, Marwin and Tabor, Jacek and Śmieja, Marek},
  booktitle = {ICLR 2024 Workshops: GEM},
  year      = {2024},
  url       = {https://mlanthology.org/iclrw/2024/gainski2024iclrw-retrogfn/}
}