Finite-Time Analysis of Asynchronous Stochastic Approximation and $q$-Learning
Abstract
We consider a general asynchronous Stochastic Approximation (SA) scheme featuring a weighted infinity-norm contractive operator, and prove a bound on its finite-time convergence rate on a single trajectory. Additionally, we specialize the result to asynchronous $Q$-learning. The resulting bound matches the sharpest available bound for synchronous $Q$-learning, and improves over previous known bounds for asynchronous $Q$-learning.
Cite
Text
Qu and Wierman. "Finite-Time Analysis of Asynchronous Stochastic Approximation and $q$-Learning." Conference on Learning Theory, 2020.Markdown
[Qu and Wierman. "Finite-Time Analysis of Asynchronous Stochastic Approximation and $q$-Learning." Conference on Learning Theory, 2020.](https://mlanthology.org/colt/2020/qu2020colt-finitetime/)BibTeX
@inproceedings{qu2020colt-finitetime,
title = {{Finite-Time Analysis of Asynchronous Stochastic Approximation and $q$-Learning}},
author = {Qu, Guannan and Wierman, Adam},
booktitle = {Conference on Learning Theory},
year = {2020},
pages = {3185-3205},
volume = {125},
url = {https://mlanthology.org/colt/2020/qu2020colt-finitetime/}
}