Hierarchical POMDP Controller Optimization by Likelihood Maximization

Abstract

Planning can often be simplified by decomposing the task into smaller tasks arranged hierarchically. Charlin et al. [4] recently showed that the hierarchy discovery problem can be framed as a non-convex optimization problem. However, the inherent computational difficulty of solving such an optimization problem makes it hard to scale to real-world problems. In another line of research, Toussaint et al. [18] developed a method to solve planning problems by maximum-likelihood estimation. In this paper, we show how the hierarchy discovery problem in partially observable domains can be tackled using a similar maximum-likelihood approach. Our technique first transforms the problem into a dynamic Bayesian network through which a hierarchical structure can naturally be discovered while optimizing the policy. Experimental results demonstrate that this approach scales better than previous techniques based on non-convex optimization.
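The reward-as-likelihood idea behind the paper's maximum-likelihood formulation (from the planning-as-inference line of work cited as [18]) can be illustrated on a tiny, fully observed MDP: when rewards are non-negative and treated as likelihoods, an EM-style derivation yields the multiplicative policy update pi(a|s) ∝ pi(a|s) Q(s,a). The sketch below uses made-up transition and reward numbers and simplifies away the paper's actual setting (hierarchical finite-state controllers for POMDPs); it is an illustration of the likelihood-maximization principle, not the authors' algorithm.

```python
import numpy as np

# Planning-as-inference sketch on a toy 2-state, 2-action MDP.
# Non-negative rewards act as likelihoods; the EM-style M-step is the
# multiplicative rule  pi(a|s) <- pi(a|s) * Q(s,a) / Z(s).
n_s, n_a, gamma = 2, 2, 0.9
# P[a, s, s']: transition probabilities (made-up numbers for illustration).
P = np.array([
    [[0.9, 0.1], [0.9, 0.1]],   # action 0 drifts toward state 0
    [[0.1, 0.9], [0.1, 0.9]],   # action 1 drifts toward state 1
])
R = np.array([0.0, 1.0])        # reward depends only on the state; state 1 pays

pi = np.full((n_s, n_a), 0.5)   # start from the uniform stochastic policy
for _ in range(300):
    # Policy evaluation: V = (I - gamma * P_pi)^(-1) R.
    P_pi = np.einsum('sa,ast->st', pi, P)
    V = np.linalg.solve(np.eye(n_s) - gamma * P_pi, R)
    # Q(s, a) = R(s) + gamma * sum_s' P(s'|s,a) V(s').
    Q = R[:, None] + gamma * np.einsum('ast,t->sa', P, V)
    # EM-style multiplicative update; Q >= 0 keeps pi a valid distribution.
    pi = pi * Q
    pi /= pi.sum(axis=1, keepdims=True)

print(np.argmax(pi, axis=1))    # the policy concentrates on action 1
```

Each iteration provably does not decrease the expected discounted reward, which is why maximizing likelihood in the corresponding dynamic Bayesian network doubles as policy optimization.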

Cite

Text

Toussaint et al. "Hierarchical POMDP Controller Optimization by Likelihood Maximization." Conference on Uncertainty in Artificial Intelligence, 2008.

Markdown

[Toussaint et al. "Hierarchical POMDP Controller Optimization by Likelihood Maximization." Conference on Uncertainty in Artificial Intelligence, 2008.](https://mlanthology.org/uai/2008/toussaint2008uai-hierarchical/)

BibTeX

@inproceedings{toussaint2008uai-hierarchical,
  title     = {{Hierarchical POMDP Controller Optimization by Likelihood Maximization}},
  author    = {Toussaint, Marc and Charlin, Laurent and Poupart, Pascal},
  booktitle = {Conference on Uncertainty in Artificial Intelligence},
  year      = {2008},
  pages     = {562--570},
  url       = {https://mlanthology.org/uai/2008/toussaint2008uai-hierarchical/}
}