Deep Exploration via Randomized Value Functions

Abstract

We study the use of randomized value functions to guide deep exploration in reinforcement learning. This offers an elegant means for synthesizing statistically and computationally efficient exploration with common practical approaches to value function learning. We present several reinforcement learning algorithms that leverage randomized value functions and demonstrate their efficacy through computational studies. We also prove a regret bound that establishes statistical efficiency with a tabular representation.
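The core mechanism the abstract describes can be illustrated with a minimal tabular sketch: before each episode, sample a plausible Q-function by running backward induction on empirical estimates perturbed with Gaussian noise whose scale shrinks with visit counts, then act greedily on that sample for the whole episode. This is a simplified illustration of the idea, not the paper's exact RLSVI algorithm; the function name, the `beta` noise scale, and the `counts + 1` shrinkage are assumptions made for this sketch.

```python
import numpy as np

def sample_randomized_q(r_hat, p_hat, counts, horizon, beta=1.0, rng=None):
    """Sample one randomized Q-function by noise-perturbed backward induction.

    r_hat:  (S, A) empirical mean rewards
    p_hat:  (S, A, S) empirical transition probabilities
    counts: (S, A) visit counts; perturbations shrink as counts grow
    Returns q of shape (horizon, S, A); acting greedily on q[h] at step h
    commits to one sampled hypothesis for the episode (deep exploration).
    """
    rng = np.random.default_rng(0) if rng is None else rng
    S, A = r_hat.shape
    q = np.zeros((horizon + 1, S, A))
    for h in reversed(range(horizon)):
        # Gaussian perturbation: large for rarely tried (s, a), small otherwise.
        noise = rng.normal(0.0, beta / np.sqrt(counts + 1.0))
        v_next = q[h + 1].max(axis=1)             # (S,) optimistic-in-sample values
        q[h] = r_hat + noise + p_hat @ v_next     # noisy Bellman backup
    return q[:horizon]
```

Because the whole episode is played greedily with respect to one coherent sample, a single optimistic perturbation at a distant state can pull the agent many steps toward it, which dithering strategies such as epsilon-greedy cannot do.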

Cite

Text

Osband et al. "Deep Exploration via Randomized Value Functions." Journal of Machine Learning Research, 2019.

Markdown

[Osband et al. "Deep Exploration via Randomized Value Functions." Journal of Machine Learning Research, 2019.](https://mlanthology.org/jmlr/2019/osband2019jmlr-deep/)

BibTeX

@article{osband2019jmlr-deep,
  title     = {{Deep Exploration via Randomized Value Functions}},
  author    = {Osband, Ian and Van Roy, Benjamin and Russo, Daniel J. and Wen, Zheng},
  journal   = {Journal of Machine Learning Research},
  year      = {2019},
  pages     = {1--62},
  volume    = {20},
  url       = {https://mlanthology.org/jmlr/2019/osband2019jmlr-deep/}
}