Improved POMDP Tree Search Planning with Prioritized Action Branching
Abstract
Online solvers for partially observable Markov decision processes have difficulty scaling to problems with large action spaces. This paper proposes a method called PA-POMCPOW to sample a subset of the action space that provides varying mixtures of exploitation and exploration for inclusion in a search tree. The proposed method first evaluates the action space according to a score function that is a linear combination of expected reward and expected information gain. The actions with the highest score are then added to the search tree during tree expansion. Experiments show that PA-POMCPOW is able to outperform existing state-of-the-art solvers on problems with large discrete action spaces.
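The scoring step described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `prioritized_actions`, the callables `expected_reward` and `expected_info_gain`, the single mixing parameter `weight`, and the branching limit `k` are all hypothetical. The abstract says only that the score is a linear combination of the two terms, so the convex form used here is an assumption.

```python
from typing import Callable, Hashable, List, Sequence


def prioritized_actions(
    actions: Sequence[Hashable],
    expected_reward: Callable[[Hashable], float],
    expected_info_gain: Callable[[Hashable], float],
    weight: float,
    k: int,
) -> List[Hashable]:
    """Return the k actions with the highest mixed score.

    Assumed score (hypothetical form of the linear combination):
        score(a) = (1 - weight) * expected_reward(a) + weight * expected_info_gain(a)
    weight = 0 ranks purely by expected reward (exploitation);
    weight = 1 ranks purely by expected information gain (exploration).
    """
    def score(action: Hashable) -> float:
        return (1.0 - weight) * expected_reward(action) + weight * expected_info_gain(action)

    # Sort descending by score and keep the top k candidates for tree expansion.
    return sorted(actions, key=score, reverse=True)[:k]
```

Under these assumptions, varying `weight` across calls would yield action subsets with different mixtures of exploitation and exploration, which is the role the abstract attributes to the score function.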
Cite
Text
Mern et al. "Improved POMDP Tree Search Planning with Prioritized Action Branching." AAAI Conference on Artificial Intelligence, 2021. doi:10.1609/AAAI.V35I13.17412
Markdown
[Mern et al. "Improved POMDP Tree Search Planning with Prioritized Action Branching." AAAI Conference on Artificial Intelligence, 2021.](https://mlanthology.org/aaai/2021/mern2021aaai-improved/) doi:10.1609/AAAI.V35I13.17412
BibTeX
@inproceedings{mern2021aaai-improved,
title = {{Improved POMDP Tree Search Planning with Prioritized Action Branching}},
author = {Mern, John and Yildiz, Anil and Bush, Lawrence and Mukerji, Tapan and Kochenderfer, Mykel J.},
booktitle = {AAAI Conference on Artificial Intelligence},
year = {2021},
pages = {11888-11894},
doi = {10.1609/AAAI.V35I13.17412},
url = {https://mlanthology.org/aaai/2021/mern2021aaai-improved/}
}