Transformers Are Efficient Hierarchical Chemical Graph Learners
Abstract
Transformers, adapted from natural language processing, are emerging as a leading approach for graph representation learning. Current graph transformers generally treat each node or edge as an individual token, which becomes computationally expensive even for moderately sized graphs because the cost of self-attention scales quadratically with the number of tokens. In this paper, we introduce SubFormer, a graph transformer that operates on subgraphs that aggregate information through a message-passing mechanism. This approach reduces the number of tokens and improves the learning of long-range interactions. We demonstrate SubFormer on benchmarks for predicting molecular properties from chemical structures and show that it is competitive with state-of-the-art graph transformers at a fraction of the computational cost, with training times on the order of minutes on a consumer-grade graphics card. We interpret the attention weights in terms of chemical structures, and we show that SubFormer exhibits limited over-smoothing and avoids the over-squashing that is prevalent in traditional graph neural networks.
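To make the hierarchical idea concrete, the following is a minimal sketch, not the authors' implementation: one round of message passing builds node representations, nodes are pooled into a small set of subgraph tokens, and a standard transformer encoder attends over that much shorter token sequence. The class name `ToySubFormer`, the single linear message-passing step, the row-normalized assignment matrix, and the mean-pool readout are illustrative assumptions rather than details taken from the paper.

```python
# Toy illustration of the subgraph-token idea (assumed structure, not the paper's code):
# message passing -> pool nodes into subgraph tokens -> self-attention over few tokens.
import torch
import torch.nn as nn


class ToySubFormer(nn.Module):
    def __init__(self, node_dim: int, hidden_dim: int = 64, num_layers: int = 2, num_heads: int = 4):
        super().__init__()
        self.embed = nn.Linear(node_dim, hidden_dim)
        # One linear layer standing in for a message-passing update.
        self.mp = nn.Linear(hidden_dim, hidden_dim)
        encoder_layer = nn.TransformerEncoderLayer(
            d_model=hidden_dim, nhead=num_heads, batch_first=True
        )
        self.encoder = nn.TransformerEncoder(encoder_layer, num_layers=num_layers)
        self.readout = nn.Linear(hidden_dim, 1)  # e.g. one scalar molecular property

    def forward(self, x, adj, assignment):
        # x:          (num_nodes, node_dim)       node features
        # adj:        (num_nodes, num_nodes)      adjacency matrix with self-loops
        # assignment: (num_subgraphs, num_nodes)  row-normalized node-to-subgraph map
        h = torch.relu(self.embed(x))
        h = torch.relu(self.mp(adj @ h))            # neighborhood aggregation (message passing)
        tokens = assignment @ h                     # pool nodes into subgraph tokens
        tokens = self.encoder(tokens.unsqueeze(0))  # self-attention over the short token sequence
        return self.readout(tokens.mean(dim=1))     # graph-level prediction


# Toy usage: a 6-node path graph grouped into 2 subgraphs of 3 nodes each.
x = torch.randn(6, 8)
adj = torch.eye(6)
for i, j in [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5)]:
    adj[i, j] = adj[j, i] = 1.0
assignment = torch.zeros(2, 6)
assignment[0, :3] = 1.0 / 3
assignment[1, 3:] = 1.0 / 3
model = ToySubFormer(node_dim=8)
print(model(x, adj, assignment).shape)  # torch.Size([1, 1])
```

The point of the sketch is the complexity argument: self-attention here runs over 2 subgraph tokens rather than 6 node tokens, so the quadratic attention cost is paid on a much shorter sequence while the message-passing step retains local chemical structure.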
Cite

Text
Pengmei et al. "Transformers Are Efficient Hierarchical Chemical Graph Learners." NeurIPS 2023 Workshops: AI4Science, 2023.

Markdown
[Pengmei et al. "Transformers Are Efficient Hierarchical Chemical Graph Learners." NeurIPS 2023 Workshops: AI4Science, 2023.](https://mlanthology.org/neuripsw/2023/pengmei2023neuripsw-transformers/)

BibTeX
@inproceedings{pengmei2023neuripsw-transformers,
title = {{Transformers Are Efficient Hierarchical Chemical Graph Learners}},
author = {Pengmei, Zihan and Li, Zimu and Tien, Chih-chan and Kondor, Risi and Dinner, Aaron},
booktitle = {NeurIPS 2023 Workshops: AI4Science},
year = {2023},
url = {https://mlanthology.org/neuripsw/2023/pengmei2023neuripsw-transformers/}
}