Learning Probably Approximately Complete and Safe Action Models for Stochastic Worlds

Abstract

We consider the problem of learning action models for planning in unknown stochastic environments that can be defined using the Probabilistic Planning Domain Description Language (PPDDL). As input, we are given a set of previously executed trajectories, and the main challenge is to learn an action model that has a similar goal achievement probability to the policies used to create these trajectories. To this end, we introduce a variant of PPDDL in which there is uncertainty about the transition probabilities, specified by an interval for each factor that contains the respective true transition probabilities. Then, we present SAM+, an algorithm that learns such an imprecise-PPDDL environment model. SAM+ has polynomial time and sample complexity, and guarantees that with high probability, the true environment is indeed captured by the defined intervals. We prove that the action model SAM+ outputs has a goal achievement probability that is almost as good as or better than that of the policies used to produce the training trajectories. Then, we show how to produce a PPDDL model based on this imprecise-PPDDL model that has similar properties.
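To make the idea of an interval that contains the true transition probability with high probability concrete, here is a minimal sketch that builds such an interval from observed counts using a two-sided Hoeffding bound. This is an illustration only, not the SAM+ algorithm: the abstract does not say how SAM+ constructs its intervals, and the function name `hoeffding_interval` and its parameters are hypothetical.

```python
import math


def hoeffding_interval(successes, trials, delta):
    """Interval for an unknown probability from i.i.d. Bernoulli samples.

    Hypothetical helper, not taken from the paper. By Hoeffding's
    inequality, the returned (low, high) contains the true probability
    with probability at least 1 - delta.
    """
    if trials <= 0:
        # No observations: only the trivial interval is justified.
        return (0.0, 1.0)
    p_hat = successes / trials
    # Solve 2 * exp(-2 * trials * eps**2) = delta for eps.
    eps = math.sqrt(math.log(2.0 / delta) / (2.0 * trials))
    # Clip to [0, 1], since the target is a probability.
    return (max(0.0, p_hat - eps), min(1.0, p_hat + eps))


# Example: an effect was observed in 80 of 100 applications of an action.
low, high = hoeffding_interval(80, 100, delta=0.05)
```

The interval width shrinks proportionally to one over the square root of the number of observations, which is the kind of behavior that makes a polynomial sample complexity plausible for interval-based models.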

Cite

Text

Juba and Stern. "Learning Probably Approximately Complete and Safe Action Models for Stochastic Worlds." AAAI Conference on Artificial Intelligence, 2022. doi:10.1609/AAAI.V36I9.21215

Markdown

[Juba and Stern. "Learning Probably Approximately Complete and Safe Action Models for Stochastic Worlds." AAAI Conference on Artificial Intelligence, 2022.](https://mlanthology.org/aaai/2022/juba2022aaai-learning/) doi:10.1609/AAAI.V36I9.21215

BibTeX

@inproceedings{juba2022aaai-learning,
  title     = {{Learning Probably Approximately Complete and Safe Action Models for Stochastic Worlds}},
  author    = {Juba, Brendan and Stern, Roni},
  booktitle = {AAAI Conference on Artificial Intelligence},
  year      = {2022},
  pages     = {9795--9804},
  doi       = {10.1609/AAAI.V36I9.21215},
  url       = {https://mlanthology.org/aaai/2022/juba2022aaai-learning/}
}