Dong et al. "Q-Learning with UCB Exploration Is Sample Efficient for Infinite-Horizon MDP." International Conference on Learning Representations, 2020.
Markdown
[Dong et al. "Q-Learning with UCB Exploration Is Sample Efficient for Infinite-Horizon MDP." International Conference on Learning Representations, 2020.](https://mlanthology.org/iclr/2020/dong2020iclr-qlearning/)
BibTeX
@inproceedings{dong2020iclr-qlearning,
  title = {{Q-Learning with UCB Exploration Is Sample Efficient for Infinite-Horizon MDP}},
  author = {Dong, Kefan and Wang, Yuanhao and Chen, Xiaoyu and Wang, Liwei},
  booktitle = {International Conference on Learning Representations},
  year = {2020},
  url = {https://mlanthology.org/iclr/2020/dong2020iclr-qlearning/}
}