Hierarchical POMDP Controller Optimization by Likelihood Maximization
Abstract
Planning can often be simplified by decomposing the task into smaller tasks arranged hierarchically. Charlin et al. [4] recently showed that the hierarchy discovery problem can be framed as a non-convex optimization problem. However, the inherent computational difficulty of solving such an optimization problem makes it hard to scale to real-world problems. In another line of research, Toussaint et al. [18] developed a method to solve planning problems by maximum-likelihood estimation. In this paper, we show how the hierarchy discovery problem in partially observable domains can be tackled using a similar maximum likelihood approach. Our technique first transforms the problem into a dynamic Bayesian network through which a hierarchical structure can naturally be discovered while optimizing the policy. Experimental results demonstrate that this approach scales better than previous techniques based on non-convex optimization.
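The paper's full hierarchy-discovery construction is beyond the abstract, but the underlying planning-as-inference idea it builds on (Toussaint et al.'s maximum-likelihood planning) can be sketched: treat nonnegative rewards as emission likelihoods in a dynamic Bayesian network, then alternate forward-backward inference (E-step) with a renormalization of the policy parameters (M-step). The sketch below is a deliberate simplification — a fully observed toy MDP with a reactive stochastic policy rather than a POMDP finite-state controller — and every number and name in it is illustrative, not from the paper:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy 3-state, 2-action MDP over a finite horizon (illustrative, not from the paper).
S, A, H = 3, 2, 10
P = rng.dirichlet(np.ones(S), size=(S, A))   # P[s, a, s'] transition distributions
R = rng.random((S, A))                       # nonnegative rewards, read as "likelihoods"
b0 = np.ones(S) / S                          # uniform start-state distribution
pi = np.ones((S, A)) / A                     # stochastic policy to be optimized

def expected_reward(pi):
    """Total expected reward over horizon H under policy pi."""
    b, total = b0.copy(), 0.0
    for _ in range(H):
        total += np.einsum('s,sa,sa->', b, pi, R)
        b = np.einsum('s,sa,sat->t', b, pi, P)
    return total

for _ in range(100):
    # Forward pass: state marginals alpha[t] under the current policy.
    alpha = [b0]
    for _ in range(H - 1):
        alpha.append(np.einsum('s,sa,sat->t', alpha[-1], pi, P))
    # Backward pass: Q[t][s, a] = expected reward-to-go from (s, a) at time t.
    Q = [None] * H
    V = np.zeros(S)
    for t in reversed(range(H)):
        Q[t] = R + np.einsum('sat,t->sa', P, V)
        V = np.einsum('sa,sa->s', pi, Q[t])
    # E-step: reward-weighted state-action occupancies over the whole horizon.
    xi = sum(a[:, None] * pi * q for a, q in zip(alpha, Q))
    # M-step: renormalize occupancies into an updated policy.
    pi = xi / xi.sum(axis=1, keepdims=True)
```

For nonnegative rewards this EM update monotonically increases the expected total reward, which is what makes the likelihood-maximization view usable for planning; the paper extends this style of inference to hierarchical finite-state controllers in partially observable domains.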
Cite
Text
Toussaint et al. "Hierarchical POMDP Controller Optimization by Likelihood Maximization." Conference on Uncertainty in Artificial Intelligence, 2008.

Markdown

[Toussaint et al. "Hierarchical POMDP Controller Optimization by Likelihood Maximization." Conference on Uncertainty in Artificial Intelligence, 2008.](https://mlanthology.org/uai/2008/toussaint2008uai-hierarchical/)

BibTeX
@inproceedings{toussaint2008uai-hierarchical,
title = {{Hierarchical POMDP Controller Optimization by Likelihood Maximization}},
author = {Toussaint, Marc and Charlin, Laurent and Poupart, Pascal},
booktitle = {Conference on Uncertainty in Artificial Intelligence},
year = {2008},
pages = {562-570},
url = {https://mlanthology.org/uai/2008/toussaint2008uai-hierarchical/}
}