GlycanML: A Multi-Task and Multi-Structure Benchmark for Glycan Machine Learning

Abstract

Glycans are fundamental biomolecules that perform essential functions in living organisms. The rapid growth of functional glycan data provides a good opportunity for machine learning approaches to glycan understanding. However, a standard machine learning benchmark for glycan property and function prediction is still lacking. In this work, we fill this gap by building a comprehensive benchmark for Glycan Machine Learning (GlycanML). The GlycanML benchmark consists of diverse types of tasks, including glycan taxonomy prediction, glycan immunogenicity prediction, glycosylation type prediction, and protein-glycan interaction prediction. Glycans can be represented by both sequences and graphs in GlycanML, which enables us to extensively evaluate sequence-based models and graph neural networks (GNNs) on the benchmark tasks. Furthermore, by concurrently performing eight glycan taxonomy prediction tasks, we introduce the GlycanML-MTL testbed for multi-task learning (MTL) algorithms. We also evaluate how taxonomy prediction can boost the other three function prediction tasks through MTL. Experimental results show the superiority of modeling glycans with multi-relational GNNs, and suitable MTL methods can further boost model performance. We provide all datasets and source code at https://github.com/GlycanML/GlycanML and maintain a leaderboard at https://GlycanML.github.io/project.

Cite

Text

Xu et al. "GlycanML: A Multi-Task and Multi-Structure Benchmark for Glycan Machine Learning." International Conference on Learning Representations, 2025.

Markdown

[Xu et al. "GlycanML: A Multi-Task and Multi-Structure Benchmark for Glycan Machine Learning." International Conference on Learning Representations, 2025.](https://mlanthology.org/iclr/2025/xu2025iclr-glycanml/)

BibTeX

@inproceedings{xu2025iclr-glycanml,
  title     = {{GlycanML: A Multi-Task and Multi-Structure Benchmark for Glycan Machine Learning}},
  author    = {Xu, Minghao and Geng, Yunteng and Zhang, Yihang and Yang, Ling and Tang, Jian and Zhang, Wentao},
  booktitle = {International Conference on Learning Representations},
  year      = {2025},
  url       = {https://mlanthology.org/iclr/2025/xu2025iclr-glycanml/}
}