<SO$G_k$>: One LLM Token for Explicit Graph Structural Understanding

Abstract

Large language models show great potential in understanding unstructured data, but still face significant challenges with graphs because of structural hallucination. Existing approaches mostly either verbalize graphs into natural language, which leads to excessive token consumption and scattered attention, or transform graphs into trainable continuous embeddings (i.e., soft prompts), which are severely misaligned with the original text tokens. To address this problem, we propose incorporating one special token <SO$G_k$> that fully represents the Structure Of Graph within a unified token space, facilitating explicit topology input and structural information sharing. Specifically, we propose a topology-aware structural tokenizer that maps each graph topology into a highly selective single token. We then construct a set of hybrid structure Question-Answering corpora to align the new structural tokens with existing text tokens. With this approach, <SO$G_k$> empowers LLMs to understand, generate, and reason in a concise and accurate manner. Extensive experiments on five graph-level benchmarks demonstrate the superiority of our method, which achieves a performance improvement of 9.9–41.4% over the baselines while exhibiting interpretability and consistency. Furthermore, our method extends flexibly to node-level tasks, enabling both global and local structural understanding. The code is publicly available at https://anonymous.4open.science/r/SOG-8432.
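
The token-consumption contrast drawn in the abstract can be illustrated with a toy sketch. It is not the paper's tokenizer: the helper names (`verbalize`, `structural_prompt`, `count_words`), the sentence template, the placeholder string `<SOG_k>`, and the index `k=17` are all hypothetical, and whitespace-separated words serve only as a crude stand-in for real LLM tokens.

```python
from typing import List, Tuple

Edge = Tuple[int, int]


def verbalize(edges: List[Edge]) -> str:
    """Describe a graph edge by edge in plain English (assumed template)."""
    return " ".join(f"Node {u} is connected to node {v}." for u, v in edges)


def structural_prompt(k: int) -> str:
    """Hypothetical stand-in for a single structural token with index k."""
    return f"<SOG_{k}>"


def count_words(text: str) -> int:
    """Crude proxy for token count: whitespace-separated words."""
    return len(text.split())


# A 6-node cycle graph as an edge list.
edges = [(i, (i + 1) % 6) for i in range(6)]

verbal = verbalize(edges)
single = structural_prompt(k=17)  # k=17 is an arbitrary, made-up index

print(count_words(verbal))  # grows linearly with the number of edges
print(count_words(single))  # constant, whatever the graph size
```

In this sketch the verbalized description grows with the number of edges, while the structural placeholder stays a single unit regardless of graph size. How a topology is actually mapped to an index `k` is the job of the paper's topology-aware structural tokenizer and is not modeled here.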

Cite

Text

Wu et al. "<SO$G_k$>: One LLM Token for Explicit Graph Structural Understanding." International Conference on Learning Representations, 2026.

Markdown

[Wu et al. "<SO$G_k$>: One LLM Token for Explicit Graph Structural Understanding." International Conference on Learning Representations, 2026.](https://mlanthology.org/iclr/2026/wu2026iclr-sog/)

BibTeX

@inproceedings{wu2026iclr-sog,
  title     = {{<SO$G_k$>: One LLM Token for Explicit Graph Structural Understanding}},
  author    = {Wu, Jingyao and Lu, Bin and Di, Zijun and Gan, Xiaoying and Jin, Meng and Fu, Luoyi and Wang, Xinbing and Zhou, Chenghu},
  booktitle = {International Conference on Learning Representations},
  year      = {2026},
  url       = {https://mlanthology.org/iclr/2026/wu2026iclr-sog/}
}