APRICODD: Approximate Policy Construction Using Decision Diagrams
Abstract
We propose a method of approximate dynamic programming for Markov decision processes (MDPs) using algebraic decision diagrams (ADDs). We produce near-optimal value functions and policies with much lower time and space requirements than exact dynamic programming. Our method reduces the sizes of the intermediate value functions generated during value iteration by replacing the values at the terminals of the ADD with ranges of values. Our method is demonstrated on a class of large MDPs (with up to 34 billion states), and we compare the results with the optimal value functions.
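The core approximation step the abstract describes is collapsing nearby terminal (leaf) values of the value-function ADD into ranges, so that distinct leaves within a tolerance merge into one. The paper's own algorithm operates on ADD structures directly; the following is only a minimal sketch of the leaf-merging idea, assuming a greedy pass over sorted leaf values and a hypothetical tolerance parameter `epsilon`:

```python
def merge_leaves(values, epsilon):
    """Greedily group sorted leaf values into ranges of span <= epsilon.

    Returns a mapping from each original leaf value to the midpoint of
    its range, so ADD terminals falling in the same range collapse into
    a single terminal. (Illustrative sketch, not the paper's algorithm.)
    """
    mapping = {}
    group = []
    for v in sorted(set(values)):
        # Start a new range when adding v would exceed the allowed span.
        if group and v - group[0] > epsilon:
            mid = (group[0] + group[-1]) / 2
            for g in group:
                mapping[g] = mid
            group = []
        group.append(v)
    if group:
        mid = (group[0] + group[-1]) / 2
        for g in group:
            mapping[g] = mid
    return mapping
```

For example, with `epsilon = 0.2`, the leaf values `[0.0, 0.1, 0.9, 1.0]` collapse to two ranges, represented by midpoints `0.05` and `0.95`; applying such a mapping to an ADD's terminals and re-reducing the diagram is what shrinks the intermediate value functions during value iteration.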
Cite
Text
St-Aubin et al. "APRICODD: Approximate Policy Construction Using Decision Diagrams." Neural Information Processing Systems, 2000.
Markdown
[St-Aubin et al. "APRICODD: Approximate Policy Construction Using Decision Diagrams." Neural Information Processing Systems, 2000.](https://mlanthology.org/neurips/2000/staubin2000neurips-apricodd/)
BibTeX
@inproceedings{staubin2000neurips-apricodd,
title = {{APRICODD: Approximate Policy Construction Using Decision Diagrams}},
author = {St-Aubin, Robert and Hoey, Jesse and Boutilier, Craig},
booktitle = {Neural Information Processing Systems},
year = {2000},
pages = {1089-1095},
url = {https://mlanthology.org/neurips/2000/staubin2000neurips-apricodd/}
}