Improved POMDP Tree Search Planning with Prioritized Action Branching

Abstract

Online solvers for partially observable Markov decision processes (POMDPs) have difficulty scaling to problems with large action spaces. This paper proposes a method, PA-POMCPOW, that samples a subset of the action space providing varying mixtures of exploitation and exploration for inclusion in a search tree. The method first scores the actions in the action space with a function that is a linear combination of expected reward and expected information gain. The highest-scoring actions are then added to the search tree during tree expansion. Experiments show that PA-POMCPOW outperforms existing state-of-the-art solvers on problems with large discrete action spaces.
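
The scoring step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `prioritize_actions`, the parameters `weight` and `k`, and the two callables that return the expected reward and expected information gain of an action are all hypothetical, and the sketch assumes those two quantities have already been estimated from the current belief.

```python
import heapq
from typing import Callable, Hashable, List, Sequence


def prioritize_actions(
    actions: Sequence[Hashable],
    expected_reward: Callable[[Hashable], float],     # hypothetical estimator
    expected_info_gain: Callable[[Hashable], float],  # hypothetical estimator
    weight: float,
    k: int,
) -> List[Hashable]:
    """Return the k actions with the highest linear-combination score.

    score(a) = expected_reward(a) + weight * expected_info_gain(a)

    A weight of 0 ranks purely by expected reward (exploitation); larger
    weights favor informative actions (exploration).
    """
    scored = [
        (expected_reward(a) + weight * expected_info_gain(a), a)
        for a in actions
    ]
    # nlargest returns the k best (score, action) pairs, best first.
    top = heapq.nlargest(k, scored, key=lambda pair: pair[0])
    return [action for _, action in top]
```

Calling the function with different values of `weight` yields action subsets with different mixtures of exploitation and exploration, which a tree search could then branch on during expansion. How the paper chooses the weight and the subset size is not stated in the abstract, so both are left as free parameters here.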

Cite

Text

Mern et al. "Improved POMDP Tree Search Planning with Prioritized Action Branching." AAAI Conference on Artificial Intelligence, 2021. doi:10.1609/AAAI.V35I13.17412

Markdown

[Mern et al. "Improved POMDP Tree Search Planning with Prioritized Action Branching." AAAI Conference on Artificial Intelligence, 2021.](https://mlanthology.org/aaai/2021/mern2021aaai-improved/) doi:10.1609/AAAI.V35I13.17412

BibTeX

@inproceedings{mern2021aaai-improved,
  title     = {{Improved POMDP Tree Search Planning with Prioritized Action Branching}},
  author    = {Mern, John and Yildiz, Anil and Bush, Lawrence and Mukerji, Tapan and Kochenderfer, Mykel J.},
  booktitle = {AAAI Conference on Artificial Intelligence},
  year      = {2021},
  pages     = {11888--11894},
  doi       = {10.1609/AAAI.V35I13.17412},
  url       = {https://mlanthology.org/aaai/2021/mern2021aaai-improved/}
}