An Information-Theoretic Parameter-Free Bayesian Framework for Probing Labeled Dependency Trees from Attention Score
Abstract
Understanding how neural language models capture syntax is key to revealing how they understand language. We systematically analyze methods for finding syntactic structures in models, known as _probing_, and find that limitations remain widespread in previous probing practice. We propose a method that estimates mutual information (MI) and extracts dependency trees from attention scores in a mathematically rigorous way, requiring no additional network training. Compared with previous approaches, it uses a much simpler model, yet it can probe more complex dependency trees and is transparent enough to support fine-grained explanation. We test our method on several open-source LLMs and demonstrate its effectiveness by systematically comparing it with many competitive baselines. Further analysis of the results yields several informative conclusions, shedding light on our method’s explanatory potential. Our code is released at https://github.com/ChristLBUPT/IPBP.
Cite
Text
Liu et al. "An Information-Theoretic Parameter-Free Bayesian Framework for Probing Labeled Dependency Trees from Attention Score." International Conference on Learning Representations, 2026.
Markdown
[Liu et al. "An Information-Theoretic Parameter-Free Bayesian Framework for Probing Labeled Dependency Trees from Attention Score." International Conference on Learning Representations, 2026.](https://mlanthology.org/iclr/2026/liu2026iclr-informationtheoretic/)
BibTeX
@inproceedings{liu2026iclr-informationtheoretic,
title = {{An Information-Theoretic Parameter-Free Bayesian Framework for Probing Labeled Dependency Trees from Attention Score}},
author = {Liu, Hongxu and Ma, Jing and Wang, Xiaojie and Yuan, Caixia and Feng, Fangxiang},
booktitle = {International Conference on Learning Representations},
year = {2026},
url = {https://mlanthology.org/iclr/2026/liu2026iclr-informationtheoretic/}
}