Proximal Exploration for Model-Guided Protein Sequence Design
Abstract
Designing protein sequences with a particular biological function is a long-standing challenge in protein engineering. Recent machine-learning-guided approaches build a surrogate sequence-function model to reduce the burden of expensive in-lab experiments. In this paper, we study the exploration mechanism of model-guided sequence design. We leverage a natural property of the protein fitness landscape: a small set of mutations to the wild-type sequence is usually sufficient to enhance the desired function. Exploiting this property, we propose the Proximal Exploration (PEX) algorithm, which prioritizes the evolutionary search for high-fitness mutants with low mutation counts. In addition, we develop a specialized model architecture, called Mutation Factorization Network (MuFacNet), to predict low-order mutational effects, which further improves the sample efficiency of model-guided evolution. We evaluate our method extensively on a suite of in-silico protein sequence design tasks and demonstrate substantial improvement over baseline algorithms.
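The abstract's idea of preferring high-fitness mutants that stay close to the wild type can be illustrated with a small sketch. This is not the paper's PEX algorithm, whose details are not given in the abstract; it is a simplified stand-in that ranks candidates by predicted fitness minus a penalty per mutation. The names `hamming_distance`, `select_proximal`, `toy_fitness`, the `penalty` weight, and the example sequences are all hypothetical.

```python
from typing import Callable, List


def hamming_distance(a: str, b: str) -> int:
    """Count positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))


def select_proximal(
    candidates: List[str],
    wild_type: str,
    predict_fitness: Callable[[str], float],
    penalty: float = 0.6,
    k: int = 2,
) -> List[str]:
    """Return the k candidates with the best penalized score.

    Score = predicted fitness - penalty * (mutation count vs. wild type).
    This penalized ranking is an illustrative assumption, not the
    selection rule from the paper.
    """
    scored = [
        (predict_fitness(seq) - penalty * hamming_distance(seq, wild_type), seq)
        for seq in candidates
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [seq for _, seq in scored[:k]]


def toy_fitness(seq: str) -> float:
    """Hypothetical surrogate model: rewards up to two 'A' residues."""
    return float(min(seq.count("A"), 2))


wild_type = "MKTL"
candidates = ["MKTL", "AKTL", "AATL", "AAAA"]
picked = select_proximal(candidates, wild_type, toy_fitness)
print(picked)
```

In this toy setup, "AAAA" and "AATL" receive the same predicted fitness, but the candidate with fewer mutations ranks higher, which is the preference the abstract describes.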
Cite
Text
Ren et al. "Proximal Exploration for Model-Guided Protein Sequence Design." International Conference on Machine Learning, 2022.
Markdown
[Ren et al. "Proximal Exploration for Model-Guided Protein Sequence Design." International Conference on Machine Learning, 2022.](https://mlanthology.org/icml/2022/ren2022icml-proximal/)
BibTeX
@inproceedings{ren2022icml-proximal,
title = {{Proximal Exploration for Model-Guided Protein Sequence Design}},
author = {Ren, Zhizhou and Li, Jiahan and Ding, Fan and Zhou, Yuan and Ma, Jianzhu and Peng, Jian},
booktitle = {International Conference on Machine Learning},
year = {2022},
pages = {18520-18536},
volume = {162},
url = {https://mlanthology.org/icml/2022/ren2022icml-proximal/}
}