POMDPs Under Probabilistic Semantics
Abstract
We consider partially observable Markov decision processes (POMDPs) with limit-average payoff, where a reward value in the interval [0,1] is associated with every transition, and the payoff of an infinite path is the long-run average of the rewards. We consider two types of path constraints: (i) a quantitative constraint, which defines the set of paths where the payoff is at least a given threshold λ in (0, 1]; and (ii) a qualitative constraint, which is the special case of the quantitative constraint with λ = 1. We consider the computation of the almost-sure winning set, where the controller needs to ensure that the path constraint is satisfied with probability 1. Our main results for the qualitative path constraint are as follows: (i) the problem of deciding the existence of a finite-memory controller is EXPTIME-complete; and (ii) the problem of deciding the existence of an infinite-memory controller is undecidable. For the quantitative path constraint, we show that the problem of deciding the existence of a finite-memory controller is undecidable.
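The definitions in the abstract can be written out as a short sketch. The notation here is an assumption made for illustration and is not taken from the abstract: ρ denotes an infinite path, r_i in [0,1] the reward of its i-th transition, σ a controller, and P^σ the probability measure it induces. Reading "long-run average" as a limit inferior is also an assumption, since the abstract does not say which limit is meant.

```latex
% Assumed notation: \rho is an infinite path, r_i \in [0,1] is the reward
% of its i-th transition. Using \liminf (rather than \limsup) is an assumption.
\[
  \mathrm{LimAvg}(\rho) \;=\; \liminf_{n \to \infty} \frac{1}{n} \sum_{i=0}^{n-1} r_i
\]

% Quantitative path constraint for a threshold \lambda \in (0,1];
% the qualitative constraint is the case \lambda = 1.
\[
  \Phi_{\lambda} \;=\; \{\, \rho \;:\; \mathrm{LimAvg}(\rho) \ge \lambda \,\}
\]

% Almost-sure winning: a controller \sigma (hypothetical symbol) must satisfy
\[
  \mathbb{P}^{\sigma}\bigl(\Phi_{\lambda}\bigr) \;=\; 1 .
\]
```

Under this reading, the qualitative constraint with λ = 1 requires the long-run average to reach the maximum possible reward, since every r_i is at most 1.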
Cite
Text
Chatterjee and Chmelik. "POMDPs Under Probabilistic Semantics." Conference on Uncertainty in Artificial Intelligence, 2013. doi:10.1016/j.artint.2014.12.009
Markdown
[Chatterjee and Chmelik. "POMDPs Under Probabilistic Semantics." Conference on Uncertainty in Artificial Intelligence, 2013.](https://mlanthology.org/uai/2013/chatterjee2013uai-pomdps/) doi:10.1016/j.artint.2014.12.009
BibTeX
@inproceedings{chatterjee2013uai-pomdps,
title = {{POMDPs Under Probabilistic Semantics}},
author = {Chatterjee, Krishnendu and Chmelik, Martin},
booktitle = {Conference on Uncertainty in Artificial Intelligence},
year = {2013},
doi = {10.1016/j.artint.2014.12.009},
url = {https://mlanthology.org/uai/2013/chatterjee2013uai-pomdps/}
}