Pure Transformers Are Powerful Graph Learners

Abstract

We show that standard Transformers without graph-specific modifications can lead to promising results in graph learning, both in theory and practice. Given a graph, we simply treat all nodes and edges as independent tokens, augment them with token embeddings, and feed them to a Transformer. With an appropriate choice of token embeddings, we prove that this approach is theoretically at least as expressive as an invariant graph network (2-IGN) composed of equivariant linear layers, which is already more expressive than all message-passing Graph Neural Networks (GNNs). When trained on a large-scale graph dataset (PCQM4Mv2), our method, coined Tokenized Graph Transformer (TokenGT), achieves significantly better results than GNN baselines and competitive results compared to Transformer variants with sophisticated graph-specific inductive bias. Our implementation is available at https://github.com/jw9730/tokengt.
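
The abstract describes the tokenization only in words; the short sketch below illustrates one way to realize it. It is a minimal illustration, not the authors' implementation (see https://github.com/jw9730/tokengt for the actual code), and it assumes PyTorch, orthonormal node identifiers obtained from a QR decomposition, and learned node/edge type embeddings as the "token embeddings": every node and every edge becomes one token, each token is augmented with those embeddings, and the resulting sequence is fed to an unmodified Transformer encoder.

# Minimal sketch of the tokenization idea (assumed details, not the authors' code).
import torch
import torch.nn as nn

class GraphTokenizer(nn.Module):
    """Turns a graph into a sequence of node and edge tokens."""
    def __init__(self, feat_dim, id_dim, hidden_dim):
        super().__init__()
        # Each token = its own features plus the identifiers of its two endpoint
        # nodes (a node token simply repeats its own identifier twice).
        self.proj = nn.Linear(feat_dim + 2 * id_dim, hidden_dim)
        self.type_emb = nn.Embedding(2, hidden_dim)  # 0 = node token, 1 = edge token

    def forward(self, node_feat, edge_feat, edge_index, node_id):
        # node_feat: [n, feat_dim], edge_feat: [m, feat_dim]
        # edge_index: [2, m] (source, target), node_id: [n, id_dim]
        node_tok = torch.cat([node_feat, node_id, node_id], dim=-1)
        edge_tok = torch.cat(
            [edge_feat, node_id[edge_index[0]], node_id[edge_index[1]]], dim=-1
        )
        tokens = self.proj(torch.cat([node_tok, edge_tok], dim=0))  # [n + m, hidden_dim]
        types = torch.cat([
            torch.zeros(node_feat.size(0), dtype=torch.long),
            torch.ones(edge_feat.size(0), dtype=torch.long),
        ])
        return tokens + self.type_emb(types)

# Toy usage: 5 nodes, 8 edges, plain nn.TransformerEncoder, no graph-specific attention.
n, m, feat_dim, hidden_dim = 5, 8, 16, 64
node_id = torch.linalg.qr(torch.randn(n, n)).Q  # rows of a random orthogonal matrix (assumed identifiers)
tokenizer = GraphTokenizer(feat_dim, id_dim=n, hidden_dim=hidden_dim)
encoder = nn.TransformerEncoder(
    nn.TransformerEncoderLayer(d_model=hidden_dim, nhead=4, batch_first=True),
    num_layers=2,
)
tokens = tokenizer(torch.randn(n, feat_dim), torch.randn(m, feat_dim),
                   torch.randint(0, n, (2, m)), node_id)
out = encoder(tokens.unsqueeze(0))  # [1, n + m, hidden_dim]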

Cite

Text

Kim et al. "Pure Transformers Are Powerful Graph Learners." Neural Information Processing Systems, 2022.

Markdown

[Kim et al. "Pure Transformers Are Powerful Graph Learners." Neural Information Processing Systems, 2022.](https://mlanthology.org/neurips/2022/kim2022neurips-pure/)

BibTeX

@inproceedings{kim2022neurips-pure,
  title     = {{Pure Transformers Are Powerful Graph Learners}},
  author    = {Kim, Jinwoo and Nguyen, Dat and Min, Seonwoo and Cho, Sungjun and Lee, Moontae and Lee, Honglak and Hong, Seunghoon},
  booktitle = {Neural Information Processing Systems},
  year      = {2022},
  url       = {https://mlanthology.org/neurips/2022/kim2022neurips-pure/}
}