Black-Box Policy Search with Probabilistic Programs

Jan-Willem van de Meent, Brooks Paige, David Tolpin, Frank D. Wood

AISTATS 2016, pp. 1195-1204

Abstract

In this work, we explore how probabilistic programs can be used to represent policies in sequential decision problems. In this formulation, a probabilistic program is a black-box stochastic simulator for both the problem domain and the agent. We relate classic policy gradient techniques to recently introduced black-box variational methods which generalize to probabilistic program inference. We present case studies in the Canadian traveler problem, Rock Sample, and a benchmark for optimal diagnosis inspired by Guess Who. Each study illustrates how programs can efficiently represent policies using moderate numbers of parameters.

Cite

Text

van de Meent et al. "Black-Box Policy Search with Probabilistic Programs." International Conference on Artificial Intelligence and Statistics, 2016.

Markdown

[van de Meent et al. "Black-Box Policy Search with Probabilistic Programs." International Conference on Artificial Intelligence and Statistics, 2016.](https://mlanthology.org/aistats/2016/vandemeent2016aistats-black/)

BibTeX

@inproceedings{vandemeent2016aistats-black,
  title     = {{Black-Box Policy Search with Probabilistic Programs}},
  author    = {van de Meent, Jan-Willem and Paige, Brooks and Tolpin, David and Wood, Frank D.},
  booktitle = {International Conference on Artificial Intelligence and Statistics},
  year      = {2016},
  pages     = {1195-1204},
  url       = {https://mlanthology.org/aistats/2016/vandemeent2016aistats-black/}
}