Probabilistic Goal Markov Decision Processes

Abstract

The Markov decision process model is a powerful tool in planning tasks and sequential decision-making problems. The randomness of state transitions and rewards implies that the performance of a policy is often stochastic. In contrast to the standard approach that studies the expected performance, we consider the policy that maximizes the probability of achieving a pre-determined target performance, a criterion we term *probabilistic goal Markov decision processes*. We show that this problem is NP-hard, but can be solved using a pseudo-polynomial algorithm. We further consider a variant dubbed "chance-constrained Markov decision problems," that treats the probability of achieving target performance as a constraint instead of the maximizing objective. This variant is NP-hard, but can be solved in pseudo-polynomial time.
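To make the criterion concrete, the sketch below illustrates the generic pseudo-polynomial construction for "maximize P(total reward ≥ target)": augment each state with the reward accumulated so far and run backward induction on the augmented chain. This is a standard illustration of the idea under simplifying assumptions (finite horizon, nonnegative integer rewards), not the paper's exact algorithm; the toy MDP and all names are hypothetical.

```python
def prob_goal_value(P, R, horizon, target):
    """Backward induction on the augmented state (s, w), where w is the
    reward accumulated so far, capped at `target` (the cap is what makes
    the table pseudo-polynomial in the target value).

    P[s][a] = list of (next_state, prob); R[s][a] = nonnegative int reward.
    Returns V with V[s][w] = max probability of ending the horizon with
    total reward >= target, from state s with accumulated reward w.
    """
    n = len(P)
    # Terminal condition: success iff the target has already been reached.
    V = [[1.0 if w >= target else 0.0 for w in range(target + 1)]
         for _ in range(n)]
    for _ in range(horizon):
        new_V = [[0.0] * (target + 1) for _ in range(n)]
        for s in range(n):
            for w in range(target + 1):
                # Best action: maximize the success probability, not the
                # expected reward.
                new_V[s][w] = max(
                    sum(p * V[s2][min(target, w + R[s][a])]
                        for s2, p in P[s][a])
                    for a in range(len(P[s]))
                )
        V = new_V
    return V


# Hypothetical toy MDP: in state 0, "safe" earns 1 and stays put, while
# "risky" earns 0 and moves to a high-reward state (prob 0.5) or a dead end.
P = [[[(0, 1.0)], [(1, 0.5), (2, 0.5)]],  # state 0: safe / risky
     [[(1, 1.0)]],                         # state 1: earns 3 per step
     [[(2, 1.0)]]]                         # state 2: earns nothing
R = [[1, 0], [3], [0]]

V = prob_goal_value(P, R, horizon=2, target=3)
print(V[0][0])  # -> 0.5
```

With horizon 2 and target 3, the safe policy earns exactly 2, so its success probability is 0, and the optimal policy gambles on the risky action (success probability 0.5) even though the safe policy has the higher expected reward. This gap between the expectation and probabilistic-goal criteria is the motivation the abstract describes.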

Cite

Text

Xu and Mannor. "Probabilistic Goal Markov Decision Processes." International Joint Conference on Artificial Intelligence, 2011. doi:10.5591/978-1-57735-516-8/IJCAI11-341

Markdown

[Xu and Mannor. "Probabilistic Goal Markov Decision Processes." International Joint Conference on Artificial Intelligence, 2011.](https://mlanthology.org/ijcai/2011/xu2011ijcai-probabilistic/) doi:10.5591/978-1-57735-516-8/IJCAI11-341

BibTeX

@inproceedings{xu2011ijcai-probabilistic,
  title     = {{Probabilistic Goal Markov Decision Processes}},
  author    = {Xu, Huan and Mannor, Shie},
  booktitle = {International Joint Conference on Artificial Intelligence},
  year      = {2011},
  pages     = {2046--2052},
  doi       = {10.5591/978-1-57735-516-8/IJCAI11-341},
  url       = {https://mlanthology.org/ijcai/2011/xu2011ijcai-probabilistic/}
}