Solving POMDPs by Searching the Space of Finite Policies
Abstract
Solving partially observable Markov decision processes (POMDPs) is highly intractable in general, at least in part because the optimal policy may be infinitely large. In this paper, we explore the problem of finding the optimal policy from a restricted set of policies, represented as finite state automata of a given size. This problem is also intractable, but we show that the complexity can be greatly reduced when the POMDP and/or policy are further constrained. We demonstrate good empirical results with a branch-and-bound method for finding globally optimal deterministic policies, and a gradient-ascent method for finding locally optimal stochastic policies.
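The abstract's second approach treats a policy as a stochastic finite-state controller (FSC): a node distribution over actions plus an observation-driven node-transition function. As a rough illustration of what such a controller looks like and how its value can be computed, here is a minimal sketch; the toy POMDP, all parameter names, and the evaluation-by-linear-solve step are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

# Minimal sketch (assumed, not the authors' code): evaluate a stochastic
# finite-state controller on a small random POMDP by solving the linear
# system over the cross-product of controller nodes and world states.

rng = np.random.default_rng(0)
S, A, O, N = 2, 3, 2, 2          # states, actions, observations, FSC nodes
gamma = 0.95

# Toy POMDP dynamics (illustrative): T[a, s, s'], Z[a, s', o], R[s, a]
T = rng.dirichlet(np.ones(S), size=(A, S))   # state-transition probabilities
Z = rng.dirichlet(np.ones(O), size=(A, S))   # observation probabilities
R = rng.normal(size=(S, A))                  # immediate rewards

# Stochastic FSC parameters: psi[n, a] = P(a | n), eta[n, o, n'] = P(n' | n, o)
psi = rng.dirichlet(np.ones(A), size=N)
eta = rng.dirichlet(np.ones(N), size=(N, O))

def fsc_value(psi, eta):
    """Solve V(n, s) for the Markov chain the FSC induces on the POMDP."""
    # Expected immediate reward: r(n, s) = sum_a psi(a|n) R(s, a)
    r = psi @ R.T                            # shape (N, S)
    # Cross-product chain:
    # P[(n,s) -> (n',s')] = sum_a psi(a|n) T(s'|s,a) sum_o Z(o|s',a) eta(n'|n,o)
    P = np.zeros((N, S, N, S))
    for n in range(N):
        for s in range(S):
            for a in range(A):
                for s2 in range(S):
                    for o in range(O):
                        P[n, s, :, s2] += (psi[n, a] * T[a, s, s2]
                                           * Z[a, s2, o] * eta[n, o, :])
    P = P.reshape(N * S, N * S)
    # Discounted value: (I - gamma * P) v = r
    v = np.linalg.solve(np.eye(N * S) - gamma * P, r.reshape(-1))
    return v.reshape(N, S)

print(fsc_value(psi, eta))
```

A gradient-ascent search of the kind the abstract describes would then adjust the controller parameters psi and eta in the direction that increases this value; a branch-and-bound search over deterministic controllers would instead enumerate node-action and node-transition assignments with value bounds used for pruning.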
Cite
Text
Meuleau et al. "Solving POMDPs by Searching the Space of Finite Policies." Conference on Uncertainty in Artificial Intelligence, 1999.

Markdown
[Meuleau et al. "Solving POMDPs by Searching the Space of Finite Policies." Conference on Uncertainty in Artificial Intelligence, 1999.](https://mlanthology.org/uai/1999/meuleau1999uai-solving/)

BibTeX
@inproceedings{meuleau1999uai-solving,
title = {{Solving POMDPs by Searching the Space of Finite Policies}},
author = {Meuleau, Nicolas and Kim, Kee-Eung and Kaelbling, Leslie Pack and Cassandra, Anthony R.},
booktitle = {Conference on Uncertainty in Artificial Intelligence},
year = {1999},
pages = {417--426},
url = {https://mlanthology.org/uai/1999/meuleau1999uai-solving/}
}