Asynchronous Stochastic Approximation and Q-Learning

Tsitsiklis, John N.

doi:10.1007/BF00993306

Machine Learning (MLJ), vol. 16, 1994, pp. 185-202

Abstract

We provide some general results on the convergence of a class of stochastic approximation algorithms and their parallel and asynchronous variants. We then use these results to study the Q-learning algorithm, a reinforcement learning method for solving Markov decision problems, and establish its convergence under conditions more general than previously available.
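For orientation, the following is a minimal sketch of tabular Q-learning on a toy Markov decision problem. It is not taken from the paper: the two-state MDP, the function name `q_learning`, the uniform exploration rule, and the per-entry step size 1/n are all assumptions chosen for illustration. Only the visited state-action entry is updated at each step, and the step sizes satisfy the usual stochastic approximation conditions (they sum to infinity while their squares have a finite sum).

```python
import random

# Hypothetical toy MDP (an assumption, not from the paper):
# 2 states, 2 actions, deterministic transitions and rewards.
NEXT_STATE = [[0, 1], [0, 1]]        # NEXT_STATE[s][a] -> next state
REWARD = [[0.0, 1.0], [0.0, 2.0]]    # REWARD[s][a] -> immediate reward
GAMMA = 0.5                          # discount factor


def q_learning(num_steps=40000, seed=0):
    """Tabular Q-learning along a single simulated trajectory."""
    rng = random.Random(seed)
    q = [[0.0, 0.0], [0.0, 0.0]]     # Q-value estimates, q[s][a]
    visits = [[0, 0], [0, 0]]        # visit counts per (s, a)
    s = 0
    for _ in range(num_steps):
        a = rng.randrange(2)         # uniform random exploration
        s_next = NEXT_STATE[s][a]
        visits[s][a] += 1
        alpha = 1.0 / visits[s][a]   # step size 1/n for this entry
        # Move the visited entry toward the one-step lookahead target;
        # all other entries are left unchanged at this step.
        target = REWARD[s][a] + GAMMA * max(q[s_next])
        q[s][a] += alpha * (target - q[s][a])
        s = s_next
    return q


if __name__ == "__main__":
    for state, row in enumerate(q_learning()):
        print(state, [round(v, 3) for v in row])
```

For this toy problem the Bellman optimality equation gives Q* = [[1.5, 3.0], [1.5, 4.0]], so the greedy action is 1 in both states. With 1/n step sizes the estimates approach these values only slowly, so a finite run should be expected to land near them rather than on them.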

Cite

Text

Tsitsiklis. "Asynchronous Stochastic Approximation and Q-Learning." Machine Learning, 1994. doi:10.1007/BF00993306

Markdown

[Tsitsiklis. "Asynchronous Stochastic Approximation and Q-Learning." Machine Learning, 1994.](https://mlanthology.org/mlj/1994/tsitsiklis1994mlj-asynchronous/) doi:10.1007/BF00993306

BibTeX

@article{tsitsiklis1994mlj-asynchronous,
  title     = {{Asynchronous Stochastic Approximation and Q-Learning}},
  author    = {Tsitsiklis, John N.},
  journal   = {Machine Learning},
  year      = {1994},
  pages     = {185--202},
  doi       = {10.1007/BF00993306},
  volume    = {16},
  url       = {https://mlanthology.org/mlj/1994/tsitsiklis1994mlj-asynchronous/}
}