PAC Optimal Exploration in Continuous Space Markov Decision Processes

Abstract

Current exploration algorithms can be classified in two broad categories: Heuristic, and PAC optimal. While numerous researchers have used heuristic approaches such as epsilon-greedy exploration successfully, such approaches lack formal, finite sample guarantees and may need a significant amount of fine-tuning to produce good results. PAC optimal exploration algorithms, on the other hand, offer strong theoretical guarantees but are inapplicable in domains of realistic size. The goal of this paper is to bridge the gap between theory and practice, by introducing C-PACE, an algorithm which offers strong theoretical guarantees and can be applied to interesting, continuous space problems.

Cite

Text

Pazis and Parr. "PAC Optimal Exploration in Continuous Space Markov Decision Processes." AAAI Conference on Artificial Intelligence, 2013. doi:10.1609/AAAI.V27I1.8678

Markdown

[Pazis and Parr. "PAC Optimal Exploration in Continuous Space Markov Decision Processes." AAAI Conference on Artificial Intelligence, 2013.](https://mlanthology.org/aaai/2013/pazis2013aaai-pac/) doi:10.1609/AAAI.V27I1.8678

BibTeX

@inproceedings{pazis2013aaai-pac,
  title     = {{PAC Optimal Exploration in Continuous Space Markov Decision Processes}},
  author    = {Pazis, Jason and Parr, Ronald},
  booktitle = {AAAI Conference on Artificial Intelligence},
  year      = {2013},
  pages     = {774-781},
  doi       = {10.1609/AAAI.V27I1.8678},
  url       = {https://mlanthology.org/aaai/2013/pazis2013aaai-pac/}
}