ASPred: Identification of Antigen Specific B-Cell Receptors from Single V(D)J Sequences Using Large Language Models
Abstract
The rapid sequencing of antibody genes has accelerated vaccine development. However, predicting synthetic antibodies capable of binding and neutralizing novel antigens remains challenging because the rules governing protein-protein interaction at the antigen surface where the cognate antibody binds are poorly understood. While recent advances in single-cell sequencing of antibody-producing B cells have improved the precision of mapping B-cell receptors (BCRs, the membrane-bound forms of antibodies) to their cognate antigens, additional challenges remain. We have developed a computational strategy, the Antibody Specificity Predictor (ASPred), in which we trained two Large Language Models (LLMs) on known sequences of antigen-BCR pairs to predict antigen-specific BCRs from the total BCR repertoire of immunized mice. By leveraging the pattern-recognition capabilities of LLMs, we successfully classify novel B-cell receptors against a challenge antigen not represented in the training set, without the need to preselect B cells by antigen binding. The properties of the top 10 predicted candidates were validated by coarse-grained molecular dynamics simulations. These results suggest that BCR-antigen sequence pairs contain sufficient information for LLMs to reliably predict antigen-antibody interaction specificity, potentially opening new avenues for the computational design of synthetic antibodies for vaccine and therapeutic development.
Cite
Text
Paco et al. "ASPred: Identification of Antigen Specific B-Cell Receptors from Single V(D)J Sequences Using Large Language Models." NeurIPS 2024 Workshops: LXAI, 2024.
Markdown
[Paco et al. "ASPred: Identification of Antigen Specific B-Cell Receptors from Single V(D)J Sequences Using Large Language Models." NeurIPS 2024 Workshops: LXAI, 2024.](https://mlanthology.org/neuripsw/2024/paco2024neuripsw-aspred/)
BibTeX
@inproceedings{paco2024neuripsw-aspred,
title = {{ASPred: Identification of Antigen Specific B-Cell Receptors from Single V(D)J Sequences Using Large Language Models}},
author = {Paco, Karen and Paco, Mariana and Zhang, Zihao and Condori, Isabel and Zebardast, Sanaz and Olatoyinbo, Peace and Patel, Dhruv and Felix, Jonathan and Yang, Tristan and Lay, Jordan and Tolstorukov, Ilya and Le Roch, Karine and Sazinsky, Matthew and Hernandez, Jeniffer and Lonardi, Stefano and Ray, Animesh},
booktitle = {NeurIPS 2024 Workshops: LXAI},
year = {2024},
url = {https://mlanthology.org/neuripsw/2024/paco2024neuripsw-aspred/}
}