Robustness in Markov Decision Problems with Uncertain Transition Matrices
Abstract
Optimal solutions to Markov Decision Problems (MDPs) are very sensitive to the state transition probabilities. In many practical problems, the estimation of those probabilities is far from accurate. Hence, estimation errors are limiting factors in applying MDPs to real-world problems. We propose an algorithm for solving finite-state and finite-action MDPs, where the solution is guaranteed to be robust with respect to estimation errors on the state transition probabilities. Our algorithm involves a statistically accurate yet numerically efficient representation of uncertainty, via Kullback-Leibler divergence bounds. The worst-case complexity of the robust algorithm is the same as that of the original Bellman recursion. Hence, robustness can be added at practically no extra computing cost.
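The robust recursion the abstract describes can be sketched numerically. The sketch below is an illustration, not the authors' implementation: the inner worst-case expectation over a KL-divergence ball around the nominal transition distribution has a one-dimensional concave dual in a multiplier lambda, which the paper solves efficiently (e.g., by bisection); here a coarse grid over lambda stands in for that step, and the toy MDP numbers are made up.

```python
import numpy as np

def worst_case_value(p_hat, v, beta, lambdas=np.logspace(-3, 3, 100)):
    """Lower bound on  min_{p : KL(p || p_hat) <= beta}  p @ v.

    Dual:  sup_{lam > 0}  -lam*beta - lam*log( sum_i p_hat[i] * exp(-v[i]/lam) ).
    A coarse lambda grid stands in for the bisection used in the paper.
    """
    v_min = v.min()  # shift for numerical stability; also the lam -> 0 limit
    z = np.exp(-(v - v_min)[None, :] / lambdas[:, None])      # shape (L, S)
    dual = v_min - lambdas * beta - lambdas * np.log(z @ p_hat)
    return max(v_min, dual.max())

def robust_value_iteration(P_hat, R, beta, gamma=0.9, iters=150):
    """Robust Bellman recursion: max over actions, worst case over transitions."""
    A, S, _ = P_hat.shape
    v = np.zeros(S)
    for _ in range(iters):
        q = np.empty((A, S))
        for a in range(A):
            for s in range(S):
                q[a, s] = R[a, s] + gamma * worst_case_value(P_hat[a, s], v, beta)
        v = q.max(axis=0)  # Bellman max over actions
    return v

# Toy 2-state, 2-action MDP (hypothetical numbers, for illustration only).
P_hat = np.array([[[0.9, 0.1],
                   [0.2, 0.8]],
                  [[0.5, 0.5],
                   [0.6, 0.4]]])   # P_hat[a, s] = nominal next-state distribution
R = np.array([[1.0, 0.0],
              [0.5, 0.5]])         # R[a, s] = reward
v_nominal = robust_value_iteration(P_hat, R, beta=0.0)
v_robust = robust_value_iteration(P_hat, R, beta=0.05)
```

Because the inner minimum can only decrease as the uncertainty radius beta grows, the robust values are bounded above by the nominal ones, while each iteration still costs only a one-dimensional optimization per (state, action) pair, consistent with the abstract's claim of essentially no extra cost over the Bellman recursion.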
Cite
Text
Nilim and El Ghaoui. "Robustness in Markov Decision Problems with Uncertain Transition Matrices." Neural Information Processing Systems, 2003.
Markdown
[Nilim and El Ghaoui. "Robustness in Markov Decision Problems with Uncertain Transition Matrices." Neural Information Processing Systems, 2003.](https://mlanthology.org/neurips/2003/nilim2003neurips-robustness/)
BibTeX
@inproceedings{nilim2003neurips-robustness,
  title = {{Robustness in Markov Decision Problems with Uncertain Transition Matrices}},
  author = {Nilim, Arnab and El Ghaoui, Laurent},
  booktitle = {Neural Information Processing Systems},
  year = {2003},
  pages = {839--846},
  url = {https://mlanthology.org/neurips/2003/nilim2003neurips-robustness/}
}