Learning Rates for Q-Learning
Abstract
In this paper we derive convergence rates for Q-learning. We show an interesting relationship between the convergence rate and the learning rate used in the Q-learning. For a polynomial learning rate, one which is 1/ t ^ω at time t where ω ε (1/2, 1), we show that that the convergence rate is polynomial in 1/(1 - γ), where γ is the discount factor. In contrast we show that for a linear learning rate, one which is 1/ t at time t , the convergence rate has an exponential dependence on 1/(1 - γ). In addition we show a simple example that proves that this exponential behavior is inherent for a linear learning rate.
Cite
Text
Even-Dar and Mansour. "Learning Rates for Q-Learning." Annual Conference on Computational Learning Theory, 2001. doi:10.1007/3-540-44581-1_39Markdown
[Even-Dar and Mansour. "Learning Rates for Q-Learning." Annual Conference on Computational Learning Theory, 2001.](https://mlanthology.org/colt/2001/evendar2001colt-learning/) doi:10.1007/3-540-44581-1_39BibTeX
@inproceedings{evendar2001colt-learning,
title = {{Learning Rates for Q-Learning}},
author = {Even-Dar, Eyal and Mansour, Yishay},
booktitle = {Annual Conference on Computational Learning Theory},
year = {2001},
pages = {589-604},
doi = {10.1007/3-540-44581-1_39},
url = {https://mlanthology.org/colt/2001/evendar2001colt-learning/}
}