Learning to Predict Ensembles of Protein Conformations from Molecular Dynamics Simulation Trajectories

Abstract

A set of heterogeneous conformations of a protein, also known as an ensemble of conformations, is key to understanding protein function, because many proteins are mechanical machines that perform tasks by changing their shapes. Nevertheless, protein structure prediction from sequence has so far focused on accurately predicting a single structure, e.g., AlphaFold (AF) [Abramson et al. (2024)] and ESMFold [Lin et al. (2023)]. Recently, methods that predict multiple conformations by subsampling multiple sequence alignments (MSAs) [del Alamo et al. (2022)] or by clustering MSAs [Wayment-Steele et al. (2024)] have been introduced. While they can predict heterogeneous conformations, they are limited in the diversity of the predicted structures and in their trainability on data other than Protein Data Bank (PDB) [Berman et al. (2000)] structures, such as molecular dynamics (MD) simulation trajectories. AlphaFlow [Jing et al. (2024)] overcame this limitation by incorporating a Flow Matching (FM) [Lipman et al. (2023)] framework with AlphaFold as the denoising model. Since an FM model can generate diverse samples by transforming initial samples drawn from a prior distribution, AlphaFlow has the potential to generate ensembles of conformations. Its authors showed that it can be trained on MD trajectories and generate physically feasible ensembles. In this paper, we look more closely at AlphaFlow’s ability to learn MD ensembles generated using Temperature Replica Exchange Molecular Dynamics (T-REMD) [Qi et al. (2018)]. This is an exploratory study conducted before we improve its architecture to propose our own model.

Cite

Text

Koo et al. "Learning to Predict Ensembles of Protein Conformations from Molecular Dynamics Simulation Trajectories." ICLR 2025 Workshops: LMRL, 2025.

Markdown

[Koo et al. "Learning to Predict Ensembles of Protein Conformations from Molecular Dynamics Simulation Trajectories." ICLR 2025 Workshops: LMRL, 2025.](https://mlanthology.org/iclrw/2025/koo2025iclrw-learning/)

BibTeX

@inproceedings{koo2025iclrw-learning,
  title     = {{Learning to Predict Ensembles of Protein Conformations from Molecular Dynamics Simulation Trajectories}},
  author    = {Koo, Bongjin and Jiang, Patrick and Dutta, Soumya and Kazan, I. Can and Ozkan, S. Banu and Kim, Paul T and Singharoy, Abhishek and Bepler, Tristan},
  booktitle = {ICLR 2025 Workshops: LMRL},
  year      = {2025},
  url       = {https://mlanthology.org/iclrw/2025/koo2025iclrw-learning/}
}