Heuristic Search Value Iteration for One-Sided Partially Observable Stochastic Games
Abstract
Security problems can be modeled as two-player partially observable stochastic games with one-sided partial observability and infinite horizon (one-sided POSGs). We seek optimal strategies of player 1 that are robust against a worst-case opponent (player 2), who is assumed to have perfect information about the game. We present a novel algorithm for approximately solving one-sided POSGs based on heuristic search value iteration (HSVI) for POMDPs. Our results include (1) theoretical properties of one-sided POSGs and their value functions, (2) guarantees showing the convergence of our algorithm to optimal strategies, and (3) a practical demonstration of the applicability and scalability of our algorithm on three different domains: pursuit-evasion, patrolling, and search games.
Cite
Text
Horák et al. "Heuristic Search Value Iteration for One-Sided Partially Observable Stochastic Games." AAAI Conference on Artificial Intelligence, 2017. doi:10.1609/AAAI.V31I1.10597
Markdown
[Horák et al. "Heuristic Search Value Iteration for One-Sided Partially Observable Stochastic Games." AAAI Conference on Artificial Intelligence, 2017.](https://mlanthology.org/aaai/2017/horak2017aaai-heuristic/) doi:10.1609/AAAI.V31I1.10597
BibTeX
@inproceedings{horak2017aaai-heuristic,
title = {{Heuristic Search Value Iteration for One-Sided Partially Observable Stochastic Games}},
author = {Horák, Karel and Bosanský, Branislav and Pechoucek, Michal},
booktitle = {AAAI Conference on Artificial Intelligence},
year = {2017},
pages = {558--564},
doi = {10.1609/AAAI.V31I1.10597},
url = {https://mlanthology.org/aaai/2017/horak2017aaai-heuristic/}
}