A Graph Is Worth $k$ Words: Euclideanizing Graph Using Pure Transformer

Abstract

Can we model Non-Euclidean graphs as pure language or even Euclidean vectors while retaining their inherent information? The Non-Euclidean property has posed a long-standing challenge in graph modeling. Despite recent efforts by graph neural networks and graph transformers to encode graphs as Euclidean vectors, recovering the original graph from those vectors remains a challenge. In this paper, we introduce GraphsGPT, featuring a Graph2Seq encoder that transforms Non-Euclidean graphs into learnable Graph Words in Euclidean space, along with a GraphGPT decoder that reconstructs the original graph from Graph Words to ensure information equivalence. We pretrain GraphsGPT on $100$M molecules and obtain several interesting findings: (1) The pretrained Graph2Seq excels in graph representation learning, achieving state-of-the-art results on $8/9$ graph classification and regression tasks. (2) The pretrained GraphGPT serves as a strong graph generator, demonstrated by its ability to perform both few-shot and conditional graph generation. (3) Graph2Seq+GraphGPT enables effective graph mixup in Euclidean space, overcoming previously known Non-Euclidean challenges. (4) The edge-centric pretraining framework GraphsGPT demonstrates its efficacy on graph-domain tasks, excelling in both representation and generation. Code is available at https://github.com/A4Bio/GraphsGPT.
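The abstract describes a pipeline in which a Graph2Seq encoder maps a graph to $k$ learnable Graph Word vectors and a GraphGPT decoder autoregressively reconstructs the graph from them, which in turn makes latent-space graph mixup possible. The sketch below is a minimal PyTorch illustration of that encode-mix-decode idea; the module names (`Graph2Seq`, `GraphGPT`), shapes, and token vocabulary are assumptions for illustration only, not the authors' implementation (see the linked repository for that).

```python
import torch
import torch.nn as nn

class Graph2Seq(nn.Module):
    """Hypothetical sketch: prepend k learnable 'Graph Word' queries to the
    graph's node/edge token embeddings and run a plain Transformer encoder."""
    def __init__(self, d_model=256, k=8, nhead=8, num_layers=4):
        super().__init__()
        self.graph_words = nn.Parameter(torch.randn(k, d_model))  # k learnable queries
        layer = nn.TransformerEncoderLayer(d_model, nhead, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers)

    def forward(self, graph_tokens):  # graph_tokens: (B, L, d_model)
        B = graph_tokens.size(0)
        queries = self.graph_words.unsqueeze(0).expand(B, -1, -1)
        h = self.encoder(torch.cat([queries, graph_tokens], dim=1))
        return h[:, : self.graph_words.size(0)]  # (B, k, d_model) Euclidean Graph Words

class GraphGPT(nn.Module):
    """Hypothetical sketch: causally decode a token sequence (e.g., an
    edge-centric serialization of the graph) conditioned on Graph Words."""
    def __init__(self, d_model=256, vocab_size=512, nhead=8, num_layers=4):
        super().__init__()
        layer = nn.TransformerDecoderLayer(d_model, nhead, batch_first=True)
        self.decoder = nn.TransformerDecoder(layer, num_layers)
        self.out = nn.Linear(d_model, vocab_size)

    def forward(self, tgt_tokens, graph_words):  # tgt_tokens: (B, T, d_model)
        mask = nn.Transformer.generate_square_subsequent_mask(tgt_tokens.size(1))
        h = self.decoder(tgt_tokens, graph_words, tgt_mask=mask)
        return self.out(h)  # next-token logits over the graph vocabulary

# Graph mixup in Euclidean space: interpolate two graphs' Graph Words, then decode.
enc, dec = Graph2Seq(), GraphGPT()
g1, g2 = torch.randn(1, 20, 256), torch.randn(1, 30, 256)  # toy token embeddings
lam = 0.5
z = lam * enc(g1) + (1 - lam) * enc(g2)   # convex combination of latent Graph Words
logits = dec(torch.randn(1, 5, 256), z)   # decode a (toy) prefix from the mixed latent
```

Because the Graph Words live in a flat vector space, the convex combination in the last step is well defined, which is exactly what the mixup finding (3) relies on; the same interpolation has no direct analogue on the Non-Euclidean graphs themselves.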

Cite

Text

Gao et al. "A Graph Is Worth $k$ Words: Euclideanizing Graph Using Pure Transformer." International Conference on Machine Learning, 2024.

Markdown

[Gao et al. "A Graph Is Worth $k$ Words: Euclideanizing Graph Using Pure Transformer." International Conference on Machine Learning, 2024.](https://mlanthology.org/icml/2024/gao2024icml-graph/)

BibTeX

@inproceedings{gao2024icml-graph,
  title     = {{A Graph Is Worth $k$ Words: Euclideanizing Graph Using Pure Transformer}},
  author    = {Gao, Zhangyang and Dong, Daize and Tan, Cheng and Xia, Jun and Hu, Bozhen and Li, Stan Z.},
  booktitle = {International Conference on Machine Learning},
  year      = {2024},
  pages     = {14681--14701},
  volume    = {235},
  url       = {https://mlanthology.org/icml/2024/gao2024icml-graph/}
}