Reduction Techniques for Model Checking and Learning in MDPs
Abstract
Omega-regular objectives in Markov decision processes (MDPs) reduce to reachability: find a policy which maximizes the probability of reaching a target set of states. Given an MDP, an initial distribution, and a target set of states, such a policy can be computed by most probabilistic model checking tools. If the MDP is only partially specified, i.e., some probabilities are unknown, then model-learning techniques can be used to statistically approximate the probabilities and enable the computation of the desired policy. For fully specified MDPs, reducing the size of the MDP translates into faster model checking; for partially specified MDPs, into faster learning. We provide reduction techniques that allow us to remove irrelevant transition probabilities: transition probabilities (known, or to be learned) that do not influence the maximal reachability probability. Among other applications, these reductions can be seen as a pre-processing of MDPs before model checking or as a way to reduce the number of experiments required to obtain a good approximation of an unknown MDP.
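To make the central notion concrete, here is a minimal sketch (not the paper's algorithm) of value iteration for the maximal reachability probability in a small, fully specified MDP. The states, actions, and probabilities are illustrative assumptions; note how the sink state's transition probabilities are "irrelevant" in the abstract's sense, since they cannot affect the maximal probability of reaching the target.

```python
# transitions[state][action] = list of (successor, probability) pairs.
# This tiny MDP is a hypothetical example, not taken from the paper.
transitions = {
    "s0": {"a": [("s0", 0.5), ("s1", 0.5)], "b": [("s2", 1.0)]},
    "s1": {"a": [("goal", 0.9), ("s2", 0.1)]},
    "s2": {"a": [("s2", 1.0)]},      # sink: its probabilities are irrelevant
    "goal": {"a": [("goal", 1.0)]},  # absorbing target state
}
target = {"goal"}

# Initialize with probability 1 on target states, 0 elsewhere; iterating
# from below converges to the least fixed point, which is the maximal
# reachability probability.
value = {s: (1.0 if s in target else 0.0) for s in transitions}

# Bellman operator: V(s) <- max_a sum_{s'} P(s' | s, a) * V(s').
for _ in range(1000):
    new_value = {
        s: 1.0 if s in target else max(
            sum(p * value[t] for t, p in succs)
            for succs in transitions[s].values()
        )
        for s in transitions
    }
    done = max(abs(new_value[s] - value[s]) for s in transitions) < 1e-10
    value = new_value
    if done:
        break

print(value["s0"])  # approaches 0.9: always choose action "a"
```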
Cite

Text
Bharadwaj et al. "Reduction Techniques for Model Checking and Learning in MDPs." International Joint Conference on Artificial Intelligence, 2017. doi:10.24963/IJCAI.2017/597

Markdown
[Bharadwaj et al. "Reduction Techniques for Model Checking and Learning in MDPs." International Joint Conference on Artificial Intelligence, 2017.](https://mlanthology.org/ijcai/2017/bharadwaj2017ijcai-reduction/) doi:10.24963/IJCAI.2017/597

BibTeX
@inproceedings{bharadwaj2017ijcai-reduction,
title = {{Reduction Techniques for Model Checking and Learning in MDPs}},
author = {Bharadwaj, Suda and Le Roux, Stéphane and Pérez, Guillermo A. and Topcu, Ufuk},
booktitle = {International Joint Conference on Artificial Intelligence},
year = {2017},
pages = {4273--4279},
doi = {10.24963/IJCAI.2017/597},
url = {https://mlanthology.org/ijcai/2017/bharadwaj2017ijcai-reduction/}
}