Efficiently Predicting High Resolution Mass Spectra with Graph Neural Networks
Abstract
Identifying a small molecule from its mass spectrum is the primary open problem in computational metabolomics. This is typically cast as information retrieval: an unknown spectrum is matched against spectra predicted computationally from a large database of chemical structures. However, current approaches to spectrum prediction model the output space in ways that force a tradeoff between capturing high resolution mass information and tractable learning. We resolve this tradeoff by casting spectrum prediction as a mapping from an input molecular graph to a probability distribution over chemical formulas. We further discover that a large corpus of mass spectra can be closely approximated using a fixed vocabulary constituting only 2% of all observed formulas. This enables efficient spectrum prediction with GrAFF-MS, an architecture similar to graph classification, which achieves significantly lower prediction error and greater retrieval accuracy than previous approaches.
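The idea of predicting a distribution over a fixed formula vocabulary can be illustrated with a minimal Python sketch. It is not the GrAFF-MS implementation: the toy vocabulary `VOCAB`, the stand-in `logits` (which a graph neural network would produce in practice), and the helper names `formula_mass`, `softmax`, and `predict_spectrum` are all assumptions made for illustration. The sketch only shows why the output keeps high resolution mass information: each vocabulary entry is a formula whose exact mass follows from its elemental composition, so no mass binning is needed.

```python
import math

# Monoisotopic masses in daltons for a few common elements (standard values).
MONOISOTOPIC_MASS = {
    "C": 12.0,
    "H": 1.00782503207,
    "N": 14.0030740048,
    "O": 15.99491461956,
}


def formula_mass(formula):
    """Exact mass of a formula given as {element: count}.

    Simplification: charge and electron mass are ignored.
    """
    return sum(MONOISOTOPIC_MASS[el] * n for el, n in formula.items())


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def predict_spectrum(logits, vocabulary):
    """Turn per-formula scores into (mass, intensity) peaks sorted by mass."""
    probs = softmax(logits)
    peaks = [(formula_mass(f), p) for f, p in zip(vocabulary, probs)]
    return sorted(peaks)


# Hypothetical three-entry vocabulary; a real one would be far larger.
VOCAB = [
    {"C": 1, "H": 3},
    {"H": 2, "O": 1},
    {"C": 2, "H": 5, "O": 1},
]

# Stand-in for the scores a trained model would output for one molecule.
logits = [0.0, 1.0, 2.0]

spectrum = predict_spectrum(logits, VOCAB)
for mass, intensity in spectrum:
    print(f"{mass:.5f}  {intensity:.3f}")
```

Because the output layer is a single softmax over a fixed list of formulas, the prediction step has the same shape as a classification head, which is the sense in which the abstract compares the architecture to graph classification.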
Cite
Text
Murphy et al. "Efficiently Predicting High Resolution Mass Spectra with Graph Neural Networks." International Conference on Machine Learning, 2023.
Markdown
[Murphy et al. "Efficiently Predicting High Resolution Mass Spectra with Graph Neural Networks." International Conference on Machine Learning, 2023.](https://mlanthology.org/icml/2023/murphy2023icml-efficiently/)
BibTeX
@inproceedings{murphy2023icml-efficiently,
title = {{Efficiently Predicting High Resolution Mass Spectra with Graph Neural Networks}},
author = {Murphy, Michael and Jegelka, Stefanie and Fraenkel, Ernest and Kind, Tobias and Healey, David and Butler, Thomas},
booktitle = {International Conference on Machine Learning},
year = {2023},
pages = {25549--25562},
volume = {202},
url = {https://mlanthology.org/icml/2023/murphy2023icml-efficiently/}
}