Structural Causal Bandits Under Markov Equivalence

Abstract

In decision-making processes, an intelligent agent with causal knowledge can optimize its action space to avoid unnecessary exploration. The *structural causal bandit* framework provides guidance on pruning actions that cannot maximize the reward by leveraging prior knowledge of the underlying causal structure among actions. A key assumption of this framework is that the agent has access to a fully specified causal diagram representing the target system. In this paper, we extend structural causal bandits to scenarios where the agent instead leverages a Markov equivalence class. In such cases, the causal structure is provided to the agent in the form of a *partial ancestral graph* (PAG). We propose a generalized framework for identifying potentially optimal actions within this graph structure, thereby broadening the applicability of structural causal bandits.

Cite

Text

Park et al. "Structural Causal Bandits Under Markov Equivalence." Advances in Neural Information Processing Systems, 2025.

Markdown

[Park et al. "Structural Causal Bandits Under Markov Equivalence." Advances in Neural Information Processing Systems, 2025.](https://mlanthology.org/neurips/2025/park2025neurips-structural/)

BibTeX

@inproceedings{park2025neurips-structural,
  title     = {{Structural Causal Bandits Under Markov Equivalence}},
  author    = {Park, Min Woo and Arditi, Andy and Bareinboim, Elias and Lee, Sanghack},
  booktitle = {Advances in Neural Information Processing Systems},
  year      = {2025},
  url       = {https://mlanthology.org/neurips/2025/park2025neurips-structural/}
}