A Causal Bandit Approach to Learning Good Atomic Interventions in Presence of Unobserved Confounders
Abstract
We study the problem of determining the best atomic intervention in a Causal Bayesian Network (CBN) specified only by its causal graph. We model this as a stochastic multi-armed bandit (MAB) problem with side-information, where atomic interventions on the CBN correspond to arms of the bandit instance. First, we propose a simple regret minimization algorithm that takes as input a causal graph with observable and unobservable nodes and in $T$ exploration rounds achieves $\tilde{O}(\sqrt{m(\mathcal{C})/T})$ expected simple regret. Here $m(\mathcal{C})$ is a parameter dependent on the input CBN $\mathcal{C}$ and could be much smaller than the number of arms. We also show that this is almost optimal for CBNs whose causal graphs have an $n$-ary tree structure. Next, we propose a cumulative regret minimization algorithm that takes as input a causal graph with observable nodes and performs better than optimal MAB algorithms that do not use causal side-information. We experimentally compare both our algorithms with the best known algorithms in the literature.
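To make the simple-regret setting concrete, the sketch below shows a naive uniform-exploration baseline over the intervention arms. This is an illustrative assumption, not the paper's algorithm: the function name, the arm representation, and the round-robin schedule are all hypothetical, and this baseline ignores the causal side-information that lets the paper's method scale with $m(\mathcal{C})$ rather than the number of arms.

```python
import random

def uniform_exploration_simple_regret(arms, T, seed=0):
    """Naive baseline for simple-regret minimization: pull each of the
    K arms roughly T/K times, then recommend the arm with the highest
    empirical mean reward. Each arm is a callable that takes an RNG and
    returns a stochastic reward (e.g., the outcome of one atomic
    intervention on the CBN)."""
    rng = random.Random(seed)
    K = len(arms)
    totals = [0.0] * K
    counts = [0] * K
    for t in range(T):
        i = t % K  # round-robin uniform exploration
        totals[i] += arms[i](rng)
        counts[i] += 1
    empirical_means = [totals[i] / counts[i] for i in range(K)]
    return max(range(K), key=lambda i: empirical_means[i])
```

With rewards in $[0,1]$, a standard concentration argument gives this baseline $\tilde{O}(\sqrt{K/T})$ expected simple regret; the paper's contribution is replacing the dependence on $K$ with the potentially much smaller $m(\mathcal{C})$.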
Cite
Text
Maiti et al. "A Causal Bandit Approach to Learning Good Atomic Interventions in Presence of Unobserved Confounders." Uncertainty in Artificial Intelligence, 2022.
Markdown
[Maiti et al. "A Causal Bandit Approach to Learning Good Atomic Interventions in Presence of Unobserved Confounders." Uncertainty in Artificial Intelligence, 2022.](https://mlanthology.org/uai/2022/maiti2022uai-causal/)
BibTeX
@inproceedings{maiti2022uai-causal,
title = {{A Causal Bandit Approach to Learning Good Atomic Interventions in Presence of Unobserved Confounders}},
author = {Maiti, Aurghya and Nair, Vineet and Sinha, Gaurav},
booktitle = {Uncertainty in Artificial Intelligence},
year = {2022},
pages = {1328--1338},
volume = {180},
url = {https://mlanthology.org/uai/2022/maiti2022uai-causal/}
}