Computing Optimal Policies for Partially Observable Decision Processes Using Compact Representations
Abstract
Partially-observable Markov decision processes provide a very general model for decision-theoretic planning problems, allowing the trade-offs between various courses of action to be determined under conditions of uncertainty, and incorporating partial observations made by an agent. Dynamic programming algorithms based on the information or belief state of an agent can be used to construct optimal policies without explicit consideration of past history, but at high computational cost. In this paper, we discuss how structured representations of the system dynamics can be incorporated in classic POMDP solution algorithms. We use Bayesian networks with structured conditional probability matrices to represent POMDPs, and use this representation to structure the belief space for POMDP algorithms. This allows irrelevant distinctions to be ignored. Apart from speeding up optimal policy construction, we suggest that such representations can be exploited to great extent in the development of ...
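To make the idea of a structured POMDP representation concrete, the sketch below (not taken from the paper) encodes a tiny two-variable domain as per-variable CPTs, each mentioning only the parents it depends on, and performs a standard Bayesian belief-state update over the induced joint state space. The "coffee robot" variables, action names, and probability values are illustrative assumptions.

```python
# A minimal sketch of a factored POMDP: Boolean state variables with
# structured CPTs, and a flat belief update derived from the factored
# transition and observation model. Domain and numbers are hypothetical.
from itertools import product

STATE_VARS = ["has_coffee", "battery_ok"]            # Boolean state variables
STATES = list(product([False, True], repeat=len(STATE_VARS)))

def transition_prob(state, action, next_state):
    """P(next_state | state, action) as a product of per-variable CPTs."""
    has_coffee, battery_ok = state
    nxt_coffee, nxt_battery = next_state
    # CPT for has_coffee': depends only on the action and current has_coffee.
    if action == "get_coffee":
        p_coffee = 0.9 if not has_coffee else 1.0
    else:
        p_coffee = 1.0 if has_coffee else 0.0
    # CPT for battery_ok': depends only on current battery_ok.
    p_battery = 0.95 if battery_ok else 0.0
    p1 = p_coffee if nxt_coffee else 1.0 - p_coffee
    p2 = p_battery if nxt_battery else 1.0 - p_battery
    return p1 * p2

def observation_prob(next_state, action, obs):
    """P(obs | next_state, action): a noisy sensor reading of battery_ok."""
    _, battery_ok = next_state
    p_ok_reading = 0.8 if battery_ok else 0.1
    return p_ok_reading if obs == "reads_ok" else 1.0 - p_ok_reading

def belief_update(belief, action, obs):
    """Standard Bayesian belief-state update over the joint state space."""
    new_belief = {}
    for s_next in STATES:
        p = sum(belief[s] * transition_prob(s, action, s_next) for s in STATES)
        new_belief[s_next] = p * observation_prob(s_next, action, obs)
    z = sum(new_belief.values())
    return {s: p / z for s, p in new_belief.items()}

# Usage: start from a uniform belief, take an action, observe, and update.
uniform = {s: 1.0 / len(STATES) for s in STATES}
posterior = belief_update(uniform, "get_coffee", "reads_ok")
print(posterior)
```

Because each CPT mentions only a few parent variables, the model is specified with far fewer parameters than an explicit joint transition matrix; the paper's contribution is to exploit exactly this kind of structure inside the dynamic programming steps rather than only in the model specification.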
Cite
Text:
Boutilier and Poole. "Computing Optimal Policies for Partially Observable Decision Processes Using Compact Representations." AAAI Conference on Artificial Intelligence, 1996.
Markdown:
[Boutilier and Poole. "Computing Optimal Policies for Partially Observable Decision Processes Using Compact Representations." AAAI Conference on Artificial Intelligence, 1996.](https://mlanthology.org/aaai/1996/boutilier1996aaai-computing/)
BibTeX:
@inproceedings{boutilier1996aaai-computing,
title = {{Computing Optimal Policies for Partially Observable Decision Processes Using Compact Representations}},
author = {Boutilier, Craig and Poole, David},
booktitle = {AAAI Conference on Artificial Intelligence},
year = {1996},
pages = {1168-1175},
url = {https://mlanthology.org/aaai/1996/boutilier1996aaai-computing/}
}