FraGNNet: A Deep Probabilistic Model for Tandem Mass Spectrum Prediction

Abstract

Compound identification from tandem mass spectrometry (MS/MS) data is a critical step in the analysis of complex mixtures. Typical solutions for the MS/MS spectrum to compound (MS2C) problem involve comparing the unknown spectrum against a library of known spectrum-molecule pairs, an approach that is limited by incomplete library coverage. Compound to MS/MS spectrum (C2MS) models can improve retrieval rates by augmenting real libraries with predicted MS/MS spectra. Unfortunately, many existing C2MS models suffer from problems with mass accuracy, generalization, or interpretability. We develop a new probabilistic method for C2MS prediction, FraGNNet, that can efficiently and accurately simulate MS/MS spectra with high mass accuracy. Our approach formulates the C2MS problem as learning a distribution over molecule fragments. FraGNNet achieves state-of-the-art performance in terms of prediction error and surpasses existing C2MS models as a tool for retrieval-based MS2C.

Cite

Text

Young et al. "FraGNNet: A Deep Probabilistic Model for Tandem Mass Spectrum Prediction." Transactions on Machine Learning Research, 2025.

Markdown

[Young et al. "FraGNNet: A Deep Probabilistic Model for Tandem Mass Spectrum Prediction." Transactions on Machine Learning Research, 2025.](https://mlanthology.org/tmlr/2025/young2025tmlr-fragnnet/)

BibTeX

@article{young2025tmlr-fragnnet,
  title     = {{FraGNNet: A Deep Probabilistic Model for Tandem Mass Spectrum Prediction}},
  author    = {Young, Adamo and Wang, Fei and Wishart, David and Wang, Bo and Greiner, Russell and Rost, Hannes},
  journal   = {Transactions on Machine Learning Research},
  year      = {2025},
  url       = {https://mlanthology.org/tmlr/2025/young2025tmlr-fragnnet/}
}