Towards Mechanistic Interpretability of Graph Transformers via Attention Graphs

El, Batu; Choudhury, Deepro; Lio, Pietro; Joshi, Chaitanya K.

doi:10.48550/arxiv.2502.12352

ICLRW 2025

Abstract

We introduce Attention Graphs, a new tool for mechanistic interpretability of Graph Neural Networks (GNNs) and Graph Transformers based on the mathematical equivalence between message passing in GNNs and the self-attention mechanism in Transformers. Attention Graphs aggregate attention matrices across Transformer layers and heads to describe how information flows among input nodes. Through experiments on homophilous and heterophilous node classification tasks, we analyze Attention Graphs from a network science perspective and find that: (1) When Graph Transformers are allowed to learn the optimal graph structure using all-to-all attention among input nodes, the Attention Graphs learned by the model do not tend to correlate with the input/original graph structure; and (2) For heterophilous graphs, different Graph Transformer variants can achieve similar performance while utilising distinct information flow patterns. Open source code: https://github.com/batu-el/understanding-inductive-biases-of-gnns
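
The abstract says Attention Graphs aggregate attention matrices across layers and heads, but it does not give the aggregation rule. The following is a minimal sketch under stated assumptions, not the authors' implementation: heads are averaged within each layer, and the per-layer matrices are multiplied across layers so that entry (i, j) approximates how much node j contributes to node i after all layers. The function name `attention_graph`, the `threshold` parameter, and the input layout `(num_layers, num_heads, n, n)` are hypothetical choices made for illustration.

```python
import numpy as np


def attention_graph(attn, threshold=None):
    """Aggregate per-layer, per-head attention into one n x n matrix.

    attn: array of shape (num_layers, num_heads, n, n), where each row of
          each n x n matrix is an attention distribution over input nodes.
    threshold: optional cut-off (an assumption of this sketch); if given,
          the result is binarised into a 0/1 adjacency matrix.
    """
    attn = np.asarray(attn, dtype=float)

    # Assumption: combine heads by simple averaging within each layer.
    per_layer = attn.mean(axis=1)  # shape (num_layers, n, n)

    # Assumption: combine layers by matrix product, later layers on the left,
    # so flow = A_L @ ... @ A_2 @ A_1.
    n = per_layer.shape[-1]
    flow = np.eye(n)
    for layer_attn in per_layer:
        flow = layer_attn @ flow

    if threshold is not None:
        return (flow >= threshold).astype(int)
    return flow


if __name__ == "__main__":
    # Random softmax attention for 2 layers, 3 heads, 4 nodes (toy data).
    rng = np.random.default_rng(0)
    logits = rng.normal(size=(2, 3, 4, 4))
    toy_attn = np.exp(logits) / np.exp(logits).sum(axis=-1, keepdims=True)
    print(attention_graph(toy_attn))
    print(attention_graph(toy_attn, threshold=0.25))
```

Because averaging and multiplying row-stochastic matrices both preserve row sums, each row of the aggregated matrix still sums to 1 and can be read as a distribution over source nodes. The thresholded variant gives a directed graph that can be compared against the input graph's adjacency matrix.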

Cite

Text

El et al. "Towards Mechanistic Interpretability of Graph Transformers via Attention Graphs." ICLR 2025 Workshops: XAI4Science, 2025. doi:10.48550/arxiv.2502.12352

Markdown

[El et al. "Towards Mechanistic Interpretability of Graph Transformers via Attention Graphs." ICLR 2025 Workshops: XAI4Science, 2025.](https://mlanthology.org/iclrw/2025/el2025iclrw-mechanistic/) doi:10.48550/arxiv.2502.12352

BibTeX

@inproceedings{el2025iclrw-mechanistic,
  title     = {{Towards Mechanistic Interpretability of Graph Transformers via Attention Graphs}},
  author    = {El, Batu and Choudhury, Deepro and Lio, Pietro and Joshi, Chaitanya K.},
  booktitle = {ICLR 2025 Workshops: XAI4Science},
  year      = {2025},
  doi       = {10.48550/arxiv.2502.12352},
  url       = {https://mlanthology.org/iclrw/2025/el2025iclrw-mechanistic/}
}