Learning General Planning Policies from Small Examples Without Supervision

Abstract

Generalized planning is concerned with the computation of general policies that solve multiple instances of a planning domain all at once. It has recently been shown that these policies can be computed in two steps: first, a suitable abstraction in the form of a qualitative numerical planning problem (QNP) is learned from sample plans; then, the general policies are obtained from the learned QNP using a planner. In this work, we introduce an alternative approach for computing more expressive general policies which does not require sample plans or a QNP planner. The new formulation is very simple and can be cast in terms that are more standard in machine learning: a large but finite pool of features is defined from the predicates in the planning examples using a general grammar, and a small subset of features is sought for separating “good” from “bad” state transitions, and goals from non-goals. The problems of finding such a “separating surface” and of labeling the transitions as “good” or “bad” are jointly addressed as a single combinatorial optimization problem expressed as a Weighted Max-SAT problem. The advantage of looking for the simplest policy in the given feature space that solves the given examples, possibly non-optimally, is that many domains have no general, compact policies that are optimal. The approach yields general policies for a number of benchmark domains.
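
The feature-selection idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the transitions are already labeled "good" or "bad" (the paper decides the labels jointly with the features), it ignores the goal versus non-goal separation, and it replaces the Weighted Max-SAT encoding with plain brute-force enumeration. The feature values, the costs, and the names `signature` and `cheapest_separating_subset` are hypothetical and chosen only for illustration.

```python
from itertools import combinations

# Hypothetical toy data: each transition is (feature values before, after)
# over a pool of three numerical features. COSTS[i] stands in for the
# syntactic complexity of feature i. None of these numbers come from the paper.
COSTS = [1, 1, 3]
GOOD = [((2, 0, 5), (1, 1, 5)), ((1, 1, 5), (1, 0, 5))]
BAD = [((2, 0, 5), (2, 1, 4)), ((1, 1, 5), (2, 0, 5))]


def signature(transition, selected):
    """Qualitative view of a transition over the selected feature indices.

    For each feature: whether it is positive before the transition, and
    whether its value goes down (-1), stays the same (0), or goes up (+1).
    """
    before, after = transition
    return tuple(
        (before[i] > 0, (after[i] > before[i]) - (after[i] < before[i]))
        for i in selected
    )


def cheapest_separating_subset(good, bad, costs):
    """Return the lowest-cost feature subset whose qualitative signatures
    tell every good transition apart from every bad one (brute force)."""
    best, best_cost = None, float("inf")
    for k in range(len(costs) + 1):
        for selected in combinations(range(len(costs)), k):
            cost = sum(costs[i] for i in selected)
            if cost >= best_cost:
                continue  # cannot improve on the best subset found so far
            separated = all(
                signature(g, selected) != signature(b, selected)
                for g in good
                for b in bad
            )
            if separated:
                best, best_cost = selected, cost
    return best, best_cost


result = cheapest_separating_subset(GOOD, BAD, COSTS)
print(result)  # ((0, 1), 2)
```

On this toy input the two cheap features suffice, so the expensive third feature is left out. Enumeration is exponential in the size of the pool, which is why a formulation over a large pool would rely on a combinatorial optimization solver instead.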

Cite

Text

Francès et al. "Learning General Planning Policies from Small Examples Without Supervision." AAAI Conference on Artificial Intelligence, 2021. doi:10.1609/AAAI.V35I13.17402

Markdown

[Francès et al. "Learning General Planning Policies from Small Examples Without Supervision." AAAI Conference on Artificial Intelligence, 2021.](https://mlanthology.org/aaai/2021/frances2021aaai-learning/) doi:10.1609/AAAI.V35I13.17402

BibTeX

@inproceedings{frances2021aaai-learning,
  title     = {{Learning General Planning Policies from Small Examples Without Supervision}},
  author    = {Francès, Guillem and Bonet, Blai and Geffner, Hector},
  booktitle = {AAAI Conference on Artificial Intelligence},
  year      = {2021},
  pages     = {11801--11808},
  doi       = {10.1609/AAAI.V35I13.17402},
  url       = {https://mlanthology.org/aaai/2021/frances2021aaai-learning/}
}