Probabilistic Goal Markov Decision Processes
Abstract
The Markov decision process model is a powerful tool in planning tasks and sequential decision making problems. The randomness of state transitions and rewards implies that the performance of a policy is often stochastic. In contrast to the standard approach that studies the expected performance, we consider the policy that maximizes the probability of achieving a pre-determined target performance, a criterion we term *probabilistic goal Markov decision processes*. We show that this problem is NP-hard but can be solved using a pseudo-polynomial algorithm. We further consider a variant, dubbed "chance-constrained Markov decision problems," that treats the probability of achieving the target performance as a constraint rather than the objective to maximize. This variant is also NP-hard but can be solved in pseudo-polynomial time.
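To make the criterion concrete, here is a minimal sketch (not code from the paper) of the standard pseudo-polynomial idea the abstract alludes to: augment each state with the reward still needed to hit the target, then run dynamic programming over this augmented space. The toy MDP, the `transitions` layout, and all function names below are illustrative assumptions.

```python
# Sketch of the probabilistic-goal criterion: maximize P(total reward >= target).
# Assumes integer rewards and a finite horizon; the augmented "reward still
# needed" dimension is what makes the running time pseudo-polynomial.

# Toy MDP (hypothetical): transitions[s][a] = list of (prob, next_state, reward)
transitions = {
    0: {
        "safe": [(1.0, 0, 1)],                  # always earn 1
        "risky": [(0.5, 0, 3), (0.5, 0, 0)],    # earn 3 with prob 0.5, else 0
    },
}

def max_goal_probability(s, horizon, target, memo=None):
    """Max probability of accumulating at least `target` reward in `horizon` steps."""
    if memo is None:
        memo = {}
    target = max(target, 0)      # once the goal is met, nothing more is needed
    if target == 0:
        return 1.0               # goal already achieved
    if horizon == 0:
        return 0.0               # out of steps with reward still missing
    key = (s, horizon, target)
    if key not in memo:
        # Best action: each action's value is the expected success probability
        # of its successors, one step and the earned reward later.
        memo[key] = max(
            sum(p * max_goal_probability(s2, horizon - 1, target - r, memo)
                for p, s2, r in outcomes)
            for outcomes in transitions[s].values()
        )
    return memo[key]

# Best plan for target 4 in 2 steps: play "risky" first; if it pays 3,
# one "safe" step finishes the goal, so the success probability is 0.5.
print(max_goal_probability(0, horizon=2, target=4))  # -> 0.5
```

Note how the optimal policy here differs from the expectation-maximizing one: it conditions on the reward accumulated so far, which is exactly why the goal probability cannot be optimized over the original state space alone.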
Cite
Text
Xu and Mannor. "Probabilistic Goal Markov Decision Processes." International Joint Conference on Artificial Intelligence, 2011. doi:10.5591/978-1-57735-516-8/IJCAI11-341
Markdown
[Xu and Mannor. "Probabilistic Goal Markov Decision Processes." International Joint Conference on Artificial Intelligence, 2011.](https://mlanthology.org/ijcai/2011/xu2011ijcai-probabilistic/) doi:10.5591/978-1-57735-516-8/IJCAI11-341
BibTeX
@inproceedings{xu2011ijcai-probabilistic,
title = {{Probabilistic Goal Markov Decision Processes}},
author = {Xu, Huan and Mannor, Shie},
booktitle = {International Joint Conference on Artificial Intelligence},
year = {2011},
  pages = {2046--2052},
doi = {10.5591/978-1-57735-516-8/IJCAI11-341},
url = {https://mlanthology.org/ijcai/2011/xu2011ijcai-probabilistic/}
}