Finite-Time Analysis of Asynchronous Stochastic Approximation and $q$-Learning

Abstract

We consider a general asynchronous Stochastic Approximation (SA) scheme featuring a weighted infinity-norm contractive operator, and prove a bound on its finite-time convergence rate on a single trajectory. Additionally, we specialize the result to asynchronous $Q$-learning. The resulting bound matches the sharpest available bound for synchronous $Q$-learning, and improves over previous known bounds for asynchronous $Q$-learning.

Cite

Text

Qu and Wierman. "Finite-Time Analysis of Asynchronous Stochastic Approximation and $q$-Learning." Conference on Learning Theory, 2020.

Markdown

[Qu and Wierman. "Finite-Time Analysis of Asynchronous Stochastic Approximation and $q$-Learning." Conference on Learning Theory, 2020.](https://mlanthology.org/colt/2020/qu2020colt-finitetime/)

BibTeX

@inproceedings{qu2020colt-finitetime,
  title     = {{Finite-Time Analysis of Asynchronous Stochastic Approximation and $q$-Learning}},
  author    = {Qu, Guannan and Wierman, Adam},
  booktitle = {Conference on Learning Theory},
  year      = {2020},
  pages     = {3185-3205},
  volume    = {125},
  url       = {https://mlanthology.org/colt/2020/qu2020colt-finitetime/}
}