Information Gathering and Reward Exploitation of Subgoals for POMDPs
Abstract
Planning in large partially observable Markov decision processes (POMDPs) is challenging, especially when a long planning horizon is required. A few recent algorithms successfully tackle this case, but at the expense of a weaker information-gathering capacity. In this paper, we propose Information Gathering and Reward Exploitation of Subgoals (IGRES), a randomized POMDP planning algorithm that leverages information in the state space to automatically generate "macro-actions" to tackle tasks with long planning horizons, while locally exploring the belief space to allow effective information gathering. Experimental results show that IGRES is an effective multi-purpose POMDP solver, providing state-of-the-art performance for both long horizon planning tasks and information-gathering tasks on benchmark domains. Additional experiments with an ecological adaptive management problem indicate that IGRES is a promising tool for POMDP planning in real-world settings.
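As background for the abstract's mention of exploring the belief space: belief-space planners maintain a probability distribution over states and update it with the standard Bayes filter after each action and observation. The sketch below is not the IGRES algorithm itself, only a minimal illustration of that belief update on a hypothetical two-state toy model (all transition and observation probabilities are illustrative assumptions).

```python
import numpy as np

def belief_update(belief, T, Z, action, observation):
    """Standard POMDP Bayes filter:
    b'(s') ∝ Z[a, s', o] * sum_s T[a, s, s'] * b(s)."""
    predicted = belief @ T[action]              # predict next-state distribution
    unnormalized = Z[action, :, observation] * predicted
    return unnormalized / unnormalized.sum()    # normalize to a valid belief

# Hypothetical toy model: 2 states, 1 action, 2 observations.
T = np.array([[[0.9, 0.1],
               [0.2, 0.8]]])                    # T[a, s, s']: transition probabilities
Z = np.array([[[0.85, 0.15],
               [0.3, 0.7]]])                    # Z[a, s', o]: observation probabilities

b0 = np.array([0.5, 0.5])                       # uniform initial belief
b1 = belief_update(b0, T, Z, action=0, observation=0)
```

Observing outcome 0, which is more likely in state 0 under this toy model, shifts the belief toward state 0; IGRES-style planners chain many such updates while steering toward informative subgoals.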
Cite

Text

Ma and Pineau. "Information Gathering and Reward Exploitation of Subgoals for POMDPs." AAAI Conference on Artificial Intelligence, 2015. doi:10.1609/AAAI.V29I1.9659

Markdown

[Ma and Pineau. "Information Gathering and Reward Exploitation of Subgoals for POMDPs." AAAI Conference on Artificial Intelligence, 2015.](https://mlanthology.org/aaai/2015/ma2015aaai-information/) doi:10.1609/AAAI.V29I1.9659

BibTeX
@inproceedings{ma2015aaai-information,
  title = {{Information Gathering and Reward Exploitation of Subgoals for POMDPs}},
  author = {Ma, Hang and Pineau, Joelle},
  booktitle = {AAAI Conference on Artificial Intelligence},
  year = {2015},
  pages = {3320-3326},
  doi = {10.1609/AAAI.V29I1.9659},
  url = {https://mlanthology.org/aaai/2015/ma2015aaai-information/}
}