FlowBack: A Flow-Matching Approach for Generative Backmapping of Macromolecules

Abstract

Coarse-grained models have become ubiquitous in biomolecular modeling tasks aimed at studying slow dynamical processes such as protein folding and DNA hybridization. Although these models considerably accelerate sampling, it remains challenging to recover an ensemble of all-atom structures corresponding to coarse-grained simulations. In this work, we introduce a generative approach called FlowBack that uses a flow-matching objective to map samples from a coarse-grained prior distribution to an all-atom data distribution. We construct our prior distribution to be amenable to any coarse-grained map and any type of macromolecule, and we find that generated structures are more robust and contain fewer steric clashes than those generated by previous approaches. We train a protein-specific model on structures from the Protein Data Bank that achieves state-of-the-art results on bond quality and clash score. Furthermore, we train a model on DNA-protein data that achieves excellent reconstruction and generative capabilities on complexes from the PDB as well as on coarse-grained simulations of DNA-protein binding.
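
To make the idea of mapping a coarse-grained prior to all-atom coordinates concrete, the following is a minimal sketch of a generic flow-matching setup, not the FlowBack implementation. The abstract does not specify the form of the prior, the interpolation path, or the integrator, so everything here is an assumption made for illustration: the Gaussian prior centered on each atom's parent bead, the linear interpolant, the Euler integrator, and the hypothetical names `sample_prior`, `flow_matching_pair`, and `euler_integrate`. An oracle velocity stands in for a trained network.

```python
import numpy as np

def sample_prior(cg_positions, atom_to_bead, sigma, rng):
    """Assumed prior: each atom starts at its parent bead plus Gaussian noise."""
    centers = cg_positions[atom_to_bead]  # shape (n_atoms, 3)
    return centers + sigma * rng.standard_normal(centers.shape)

def flow_matching_pair(x0, x1, t):
    """Linear interpolant x_t and its regression target, the velocity x1 - x0."""
    x_t = (1.0 - t) * x0 + t * x1
    return x_t, x1 - x0

def euler_integrate(velocity_fn, x0, n_steps=10):
    """Integrate dx/dt = velocity_fn(x, t) from t = 0 to t = 1 with Euler steps."""
    x = x0.copy()
    dt = 1.0 / n_steps
    for i in range(n_steps):
        x = x + dt * velocity_fn(x, i * dt)
    return x

rng = np.random.default_rng(0)

# Toy system (hypothetical): two beads, five atoms assigned to them.
cg_positions = np.array([[0.0, 0.0, 0.0], [3.8, 0.0, 0.0]])
atom_to_bead = np.array([0, 0, 0, 1, 1])

# Toy "all-atom" target and a prior sample conditioned on the beads.
x1 = cg_positions[atom_to_bead] + 0.5 * rng.standard_normal((5, 3))
x0 = sample_prior(cg_positions, atom_to_bead, sigma=0.3, rng=rng)

# Training pair at t = 0.5: a network would regress v_target from (x_t, t).
x_t, v_target = flow_matching_pair(x0, x1, t=0.5)

# Oracle velocity in place of a trained network: the straight-line velocity
# carries the prior sample to the target after integrating to t = 1.
x_gen = euler_integrate(lambda x, t: x1 - x0, x0, n_steps=10)
```

In a real model the oracle would be replaced by a learned velocity field, and drawing several prior samples for the same coarse-grained configuration would yield an ensemble of all-atom structures.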

Cite

Text

Jones et al. "FlowBack: A Flow-Matching Approach for Generative Backmapping of Macromolecules." ICML 2024 Workshops: ML4LMS, 2024.

Markdown

[Jones et al. "FlowBack: A Flow-Matching Approach for Generative Backmapping of Macromolecules." ICML 2024 Workshops: ML4LMS, 2024.](https://mlanthology.org/icmlw/2024/jones2024icmlw-flowback/)

BibTeX

@inproceedings{jones2024icmlw-flowback,
  title     = {{FlowBack: A Flow-Matching Approach for Generative Backmapping of Macromolecules}},
  author    = {Jones, Michael and Khanna, Smayan and Ferguson, Andrew},
  booktitle = {ICML 2024 Workshops: ML4LMS},
  year      = {2024},
  url       = {https://mlanthology.org/icmlw/2024/jones2024icmlw-flowback/}
}