Planning with Information-Processing Constraints and Model Uncertainty in Markov Decision Processes
Abstract
Information-theoretic principles for learning and acting have been proposed to solve particular classes of Markov Decision Problems. Mathematically, such approaches are governed by a variational free energy principle and allow solving MDP planning problems with information-processing constraints expressed in terms of a Kullback-Leibler divergence with respect to a reference distribution. Here we consider a generalization of such MDP planners by taking model uncertainty into account. As model uncertainty can also be formalized as an information-processing constraint, we can derive a unified solution from a single generalized variational principle. We provide a generalized value iteration scheme together with a convergence proof. As limit cases, this generalized scheme includes standard value iteration with a known model, Bayesian MDP planning, and robust planning. We demonstrate the benefits of this approach in a grid world simulation.
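The abstract does not state the update rule itself, so the following is only a minimal sketch of what a value iteration with two KL-type constraints might look like, not the authors' scheme. It assumes a finite set of candidate transition models, a fixed belief `mu` over them, a reference policy `rho`, and two temperature parameters: `alpha` for the action constraint and `beta` for the model constraint. The function names, array layouts, and the nested log-sum-exp form are all illustrative assumptions.

```python
import numpy as np

def soft_aggregate(values, weights, temp, axis):
    """Return (1/temp) * log sum_i weights_i * exp(temp * values_i) along `axis`.

    temp == 0 gives the weighted mean; large positive temp approaches the max,
    large negative temp approaches the min (hypothetical helper).
    """
    if temp == 0.0:
        return np.sum(weights * values, axis=axis)
    z = temp * values
    z_max = np.max(z, axis=axis, keepdims=True)  # shift for numerical stability
    out = z_max + np.log(np.sum(weights * np.exp(z - z_max), axis=axis, keepdims=True))
    return np.squeeze(out, axis=axis) / temp

def generalized_value_iteration(T, R, rho, mu, alpha, beta,
                                gamma=0.9, tol=1e-8, max_iter=10_000):
    """Sketch of a free-energy value iteration (assumed form, see lead-in).

    T:   (M, S, A, S) candidate transition models
    R:   (S, A, S)    rewards
    rho: (S, A)       reference policy
    mu:  (M,)         belief over the M candidate models
    """
    F = np.zeros(R.shape[0])
    for _ in range(max_iter):
        # Expected one-step return under each candidate model: shape (M, S, A).
        q_model = np.einsum("msat,sat->msa", T, R + gamma * F[None, None, :])
        # Aggregate over models: beta = 0 averages, beta << 0 is pessimistic.
        q = soft_aggregate(q_model, mu[:, None, None], beta, axis=0)
        # Aggregate over actions: large alpha approaches a hard maximum.
        F_new = soft_aggregate(q, rho, alpha, axis=1)
        if np.max(np.abs(F_new - F)) < tol:
            return F_new
        F = F_new
    return F
```

Under these assumptions, a single model with a large `alpha` behaves like standard value iteration, `beta = 0` averages over the belief as in Bayesian planning, and a strongly negative `beta` weights the worst-case model, which loosely mirrors the limit cases listed in the abstract.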
Cite
Text
Grau-Moya et al. "Planning with Information-Processing Constraints and Model Uncertainty in Markov Decision Processes." European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, 2016. doi:10.1007/978-3-319-46227-1_30
Markdown
[Grau-Moya et al. "Planning with Information-Processing Constraints and Model Uncertainty in Markov Decision Processes." European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, 2016.](https://mlanthology.org/ecmlpkdd/2016/graumoya2016ecmlpkdd-planning/) doi:10.1007/978-3-319-46227-1_30
BibTeX
@inproceedings{graumoya2016ecmlpkdd-planning,
title = {{Planning with Information-Processing Constraints and Model Uncertainty in Markov Decision Processes}},
author = {Grau-Moya, Jordi and Leibfried, Felix and Genewein, Tim and Braun, Daniel A.},
booktitle = {European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases},
year = {2016},
pages = {475-491},
doi = {10.1007/978-3-319-46227-1_30},
url = {https://mlanthology.org/ecmlpkdd/2016/graumoya2016ecmlpkdd-planning/}
}