Scalable Initial State Interdiction for Factored MDPs

Abstract

We propose a novel Stackelberg game model of MDP interdiction in which the defender modifies the initial state of the planner, who then responds by computing an optimal policy starting with that state. We first develop a novel approach for MDP interdiction in factored state space that allows the defender to modify the initial state. The resulting approach can be computationally expensive for large factored MDPs. To address this, we develop several interdiction algorithms that leverage variations of reinforcement learning using both linear and non-linear function approximation. Finally, we extend the interdiction framework to consider a Bayesian interdiction problem in which the interdictor is uncertain about some of the planner's initial state features. Extensive experiments demonstrate the effectiveness of our approaches.

Cite

Text

Panda and Vorobeychik. "Scalable Initial State Interdiction for Factored MDPs." International Joint Conference on Artificial Intelligence, 2018. doi:10.24963/IJCAI.2018/667

Markdown

[Panda and Vorobeychik. "Scalable Initial State Interdiction for Factored MDPs." International Joint Conference on Artificial Intelligence, 2018.](https://mlanthology.org/ijcai/2018/panda2018ijcai-scalable/) doi:10.24963/IJCAI.2018/667

BibTeX

@inproceedings{panda2018ijcai-scalable,
  title     = {{Scalable Initial State Interdiction for Factored MDPs}},
  author    = {Panda, Swetasudha and Vorobeychik, Yevgeniy},
  booktitle = {International Joint Conference on Artificial Intelligence},
  year      = {2018},
  pages     = {4801-4807},
  doi       = {10.24963/IJCAI.2018/667},
  url       = {https://mlanthology.org/ijcai/2018/panda2018ijcai-scalable/}
}