Efficient Inference and Exploration for Reinforcement Learning
Abstract
Despite an ever-growing literature on reinforcement learning algorithms and applications, much less is known about their statistical inference. In this paper, we investigate the large-sample behavior of Q-value estimates and provide closed-form characterizations of their asymptotic variances. This allows us to efficiently construct confidence regions for Q-values and optimal value functions, and to develop policies that minimize their estimation errors. It also leads to a policy exploration strategy based on estimating the relative discrepancies among the Q-value estimates. Numerical experiments show that our exploration strategy outperforms benchmark approaches.
Cite
Text
Zhu et al. "Efficient Inference and Exploration for Reinforcement Learning." International Conference on Learning Representations, 2020.
Markdown
[Zhu et al. "Efficient Inference and Exploration for Reinforcement Learning." International Conference on Learning Representations, 2020.](https://mlanthology.org/iclr/2020/zhu2020iclr-efficient/)
BibTeX
@inproceedings{zhu2020iclr-efficient,
title = {{Efficient Inference and Exploration for Reinforcement Learning}},
author = {Zhu, Yi and Dong, Jing and Lam, Henry},
booktitle = {International Conference on Learning Representations},
year = {2020},
url = {https://mlanthology.org/iclr/2020/zhu2020iclr-efficient/}
}