Bayesian Network Structure Discovery Using Large Language Models
Abstract
Understanding probabilistic dependencies among variables is central to analyzing complex systems. Traditional structure learning methods often require extensive observational data or are limited by manual, error-prone incorporation of expert knowledge. Recent studies have explored using large language models (LLMs) for structure learning, but most treat LLMs as auxiliary tools for pre-processing or post-processing, leaving the core learning process data-driven. In this work, we introduce a unified framework for Bayesian network structure discovery that places LLMs at the center and supports both data-free and data-aware settings. In the data-free regime, we introduce PromptBN, which leverages LLM reasoning over variable metadata to generate a complete directed acyclic graph (DAG) in a single call. PromptBN enforces global consistency and acyclicity through dual validation while achieving constant $\mathcal{O}(1)$ query complexity. When observational data are available, we introduce ReActBN to further refine the initial graph. ReActBN combines statistical evidence with LLM reasoning by integrating novel ReAct-style reasoning with configurable structure scores (e.g., the Bayesian Information Criterion). Experiments demonstrate that our method outperforms prior data-only, LLM-only, and hybrid baselines, particularly in low- or no-data regimes and on out-of-distribution datasets.
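The abstract does not say what the dual validation consists of, so the following is only a minimal sketch of one ingredient it mentions: checking that an edge list proposed for a set of variables is acyclic. It uses Kahn's topological-sort algorithm; the function name `is_acyclic` and the example variable names are hypothetical and are not taken from the paper.

```python
from collections import deque


def is_acyclic(nodes, edges):
    """Return True if the directed graph (nodes, edges) contains no cycle.

    Illustrative check only (Kahn's algorithm); this is an assumption about
    how a proposed graph could be validated, not the paper's procedure.
    """
    indegree = {n: 0 for n in nodes}
    children = {n: [] for n in nodes}
    for parent, child in edges:
        # Reject edges that mention variables outside the declared set.
        if parent not in indegree or child not in indegree:
            raise ValueError(f"unknown variable in edge {(parent, child)!r}")
        children[parent].append(child)
        indegree[child] += 1

    # Repeatedly remove nodes that have no remaining incoming edges.
    queue = deque(n for n in indegree if indegree[n] == 0)
    visited = 0
    while queue:
        node = queue.popleft()
        visited += 1
        for child in children[node]:
            indegree[child] -= 1
            if indegree[child] == 0:
                queue.append(child)

    # Every node is removed exactly when the graph has no cycle.
    return visited == len(indegree)


# Hypothetical variables and edges, as an LLM might return them.
variables = ["Smoking", "Cancer", "Xray"]
proposed = [("Smoking", "Cancer"), ("Cancer", "Xray")]
print(is_acyclic(variables, proposed))
print(is_acyclic(variables, proposed + [("Xray", "Smoking")]))
```

A graph that fails this check could be rejected or sent back for correction; how the paper actually handles such cases is not stated in the abstract.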
Cite
Text
Zhang et al. "Bayesian Network Structure Discovery Using Large Language Models." Transactions on Machine Learning Research, 2026.
Markdown
[Zhang et al. "Bayesian Network Structure Discovery Using Large Language Models." Transactions on Machine Learning Research, 2026.](https://mlanthology.org/tmlr/2026/zhang2026tmlr-bayesian/)
BibTeX
@article{zhang2026tmlr-bayesian,
title = {{Bayesian Network Structure Discovery Using Large Language Models}},
author = {Zhang, Yinghuan and Zhang, Yufei and Kordjamshidi, Parisa and Cui, Zijun},
journal = {Transactions on Machine Learning Research},
year = {2026},
url = {https://mlanthology.org/tmlr/2026/zhang2026tmlr-bayesian/}
}