Understanding Energy-Based Modeling of Proteins via an Empirically Motivated Minimal Ground Truth Model
Abstract
Energy-based models (EBMs) fitted to sequences from evolutionarily related protein families can learn the generic constraints needed to generate novel functional sequences, as validated by \textit{in vivo} experiments. However, these learned energy functions must be re-scaled by a temperature parameter in order to sample novel functional sequences. Here, we generate data from a minimal model motivated by a wide array of empirical evidence for a synergistic cluster of amino acids, or sector, within a sequence. We find that this setting captures salient learning behaviors similar to those exhibited by EBMs fitted to real proteins, namely the need for temperature tuning to improve generative performance. We discuss how this informs our understanding of the functional sequence space of proteins.
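To make the temperature re-scaling mentioned in the abstract concrete, here is a minimal sketch. It assumes the standard Boltzmann form p(s) ∝ exp(-E(s)/T), which the abstract does not spell out, and it uses a hypothetical toy energy over binary sequences of length 4 (the names `boltzmann`, `reference`, and `energies` are illustrative and not taken from the paper).

```python
import itertools
import math


def boltzmann(energies, temperature):
    """Return p(s) proportional to exp(-E(s) / T) over an enumerated state space."""
    weights = {s: math.exp(-e / temperature) for s, e in energies.items()}
    z = sum(weights.values())  # partition function at this temperature
    return {s: w / z for s, w in weights.items()}


# Hypothetical toy energy (an assumption for illustration only):
# the number of positions at which a sequence differs from a reference.
reference = (1, 1, 0, 0)
energies = {
    s: float(sum(a != b for a, b in zip(s, reference)))
    for s in itertools.product((0, 1), repeat=4)
}

p_unscaled = boltzmann(energies, temperature=1.0)  # energy used as-is
p_cold = boltzmann(energies, temperature=0.5)      # energy re-scaled by T = 0.5
```

In this toy setting, lowering T concentrates probability on low-energy sequences and raising T flattens the distribution. Which direction of tuning helps a fitted model is an empirical question that this sketch does not address.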
Cite
Text
Fields et al. "Understanding Energy-Based Modeling of Proteins via an Empirically Motivated Minimal Ground Truth Model." ICML 2023 Workshops: SynS_and_ML, 2023.
Markdown
[Fields et al. "Understanding Energy-Based Modeling of Proteins via an Empirically Motivated Minimal Ground Truth Model." ICML 2023 Workshops: SynS_and_ML, 2023.](https://mlanthology.org/icmlw/2023/fields2023icmlw-understanding/)
BibTeX
@inproceedings{fields2023icmlw-understanding,
title = {{Understanding Energy-Based Modeling of Proteins via an Empirically Motivated Minimal Ground Truth Model}},
author = {Fields, Peter William and Ngampruetikorn, Vudtiwat and Ranganathan, Rama and Schwab, David J. and Palmer, Stephanie},
booktitle = {ICML 2023 Workshops: SynS_and_ML},
year = {2023},
url = {https://mlanthology.org/icmlw/2023/fields2023icmlw-understanding/}
}