A Bayesian Sampling Approach to Exploration in Reinforcement Learning
Abstract
We present a modular approach to reinforcement learning that uses a Bayesian representation of the uncertainty over models. The approach, BOSS (Best of Sampled Set), drives exploration by sampling multiple models from the posterior and selecting actions optimistically. It extends previous work by providing a rule for deciding when to re-sample and how to combine the models. We show that our algorithm achieves near-optimal reward with high probability with a sample complexity that is low relative to the speed at which the posterior distribution converges during learning. We demonstrate that BOSS performs quite favorably compared to state-of-the-art reinforcement-learning approaches and illustrate its flexibility by pairing it with a non-parametric model that generalizes across states.
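The abstract's core idea can be illustrated with a small sketch: draw several MDP models from the posterior, merge them into a single MDP whose action set pairs each real action with a sampled model, plan optimistically in that merged MDP, and re-sample when enough new data has arrived. This is not the authors' code; the toy chain environment, the re-sampling counter, the constants `K` and `B`, and all function names are illustrative assumptions.

```python
# Minimal sketch of the BOSS idea (assumptions noted above), not the paper's implementation.
import numpy as np

rng = np.random.default_rng(0)

S, A = 5, 2        # toy state/action space (assumption)
K = 5              # number of models sampled from the posterior
B = 10             # re-sample after B new visits to some (s, a) (simplified trigger)
gamma = 0.95

# Dirichlet posterior over next-state distributions, one per (s, a).
alpha = np.ones((S, A, S))                 # prior pseudo-counts
R = np.zeros((S, A)); R[S - 1, :] = 1.0    # known rewards for the toy chain

def sample_models():
    """Draw K transition models T[k, s, a, :] from the current posterior."""
    return np.array([[[rng.dirichlet(alpha[s, a]) for a in range(A)]
                      for s in range(S)] for _ in range(K)])

def solve_merged(T, iters=200):
    """Value-iterate the merged MDP whose K*A actions are (model, action) pairs."""
    V = np.zeros(S)
    for _ in range(iters):
        Q = R[None] + gamma * np.einsum('ksan,n->ksa', T, V)
        V = Q.max(axis=(0, 2))             # optimistic: best over models and actions
    return R[None] + gamma * np.einsum('ksan,n->ksa', T, V)

def true_step(s, a):
    """Hypothetical environment: a noisy chain that rewards reaching the last state."""
    s2 = min(s + 1, S - 1) if (a == 1 and rng.random() < 0.8) else max(s - 1, 0)
    return s2, R[s, a]

counts_since_sample = np.zeros((S, A))
Q = solve_merged(sample_models())
s = 0
for t in range(2000):
    k, a = np.unravel_index(Q[:, s, :].argmax(), (K, A))   # greedy (model, action) pair
    s2, r = true_step(s, a)
    alpha[s, a, s2] += 1                   # Bayesian posterior update
    counts_since_sample[s, a] += 1
    if counts_since_sample[s, a] >= B:     # re-sampling rule (sketch of the paper's idea)
        Q = solve_merged(sample_models())
        counts_since_sample[:] = 0
    s = s2
```

Acting greedily in the merged MDP is what makes the sampled set "optimistic": the agent gets to choose, per state, whichever sampled model promises the highest value, which encourages visiting state-action pairs whose posterior is still uncertain.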
Cite
Text
Asmuth et al. "A Bayesian Sampling Approach to Exploration in Reinforcement Learning." Conference on Uncertainty in Artificial Intelligence, 2009.
Markdown
[Asmuth et al. "A Bayesian Sampling Approach to Exploration in Reinforcement Learning." Conference on Uncertainty in Artificial Intelligence, 2009.](https://mlanthology.org/uai/2009/asmuth2009uai-bayesian/)
BibTeX
@inproceedings{asmuth2009uai-bayesian,
title = {{A Bayesian Sampling Approach to Exploration in Reinforcement Learning}},
author = {Asmuth, John and Li, Lihong and Littman, Michael L. and Nouri, Ali and Wingate, David},
booktitle = {Conference on Uncertainty in Artificial Intelligence},
year = {2009},
pages = {19-26},
url = {https://mlanthology.org/uai/2009/asmuth2009uai-bayesian/}
}