Sample Efficient Reinforcement Learning with Gaussian Processes

Abstract

This paper derives sample complexity results for using Gaussian Processes (GPs) in both model-based and model-free reinforcement learning (RL). We show that GPs are KWIK learnable, proving for the first time that a model-based RL approach using GPs, GP-Rmax, is sample efficient (PAC-MDP). However, we then show that previous approaches to model-free RL using GPs take an exponential number of steps to find an optimal policy, and are therefore not sample efficient. The third and main contribution is the introduction of a model-free RL algorithm using GPs, DGPQ, which is sample efficient and, in contrast to model-based algorithms, capable of acting in real time, as demonstrated on a five-dimensional aircraft simulator.

Cite

Text

Grande et al. "Sample Efficient Reinforcement Learning with Gaussian Processes." International Conference on Machine Learning, 2014.

Markdown

[Grande et al. "Sample Efficient Reinforcement Learning with Gaussian Processes." International Conference on Machine Learning, 2014.](https://mlanthology.org/icml/2014/grande2014icml-sample/)

BibTeX

@inproceedings{grande2014icml-sample,
  title     = {{Sample Efficient Reinforcement Learning with Gaussian Processes}},
  author    = {Grande, Robert and Walsh, Thomas and How, Jonathan},
  booktitle = {International Conference on Machine Learning},
  year      = {2014},
  pages     = {1332--1340},
  volume    = {32},
  url       = {https://mlanthology.org/icml/2014/grande2014icml-sample/}
}