A Unified Analysis of Value-Function-Based Reinforcement Learning Algorithms
Abstract
Reinforcement learning is the problem of generating optimal behavior in a sequential decision-making environment given the opportunity of interacting with it. Many algorithms for solving reinforcement-learning problems work by computing improved estimates of the optimal value function. We extend prior analyses of reinforcement-learning algorithms and present a powerful new theorem that can provide a unified analysis of such value-function-based reinforcement-learning algorithms. The usefulness of the theorem lies in how it allows the convergence of a complex asynchronous reinforcement-learning algorithm to be proved by verifying that a simpler synchronous algorithm converges. We illustrate the application of the theorem by analyzing the convergence of Q-learning, model-based reinforcement learning, Q-learning with multistate updates, Q-learning for Markov games, and risk-sensitive reinforcement learning.
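As an illustrative sketch only (not taken from the paper), the synchronous value-function update the abstract alludes to can be demonstrated on a hypothetical two-state, two-action MDP: every state-action pair is updated at once with the standard Q-learning target, and the iterates converge to the optimal Q-values.

```python
import numpy as np

# Hypothetical toy MDP (illustrative assumption, not from the paper):
# 2 states, 2 actions, deterministic transitions.
gamma = 0.9
T = np.array([[0, 1], [0, 1]])        # T[s, a] -> next state
R = np.array([[0.0, 1.0], [0.0, 1.0]])  # R[s, a] -> immediate reward

Q = np.zeros((2, 2))
alpha = 0.5  # constant step size
for _ in range(500):
    # Synchronous update: all (s, a) pairs use the same old Q estimate.
    target = R + gamma * Q[T].max(axis=-1)
    Q = (1 - alpha) * Q + alpha * target

# Fixed point here: V*(s) = 1 / (1 - gamma) = 10 for both states,
# so Q*(s, 0) = 0 + 0.9 * 10 = 9 and Q*(s, 1) = 1 + 0.9 * 10 = 10.
print(Q)
```

The per-iteration contraction factor is (1 - alpha) + alpha * gamma, so with a constant step size the synchronous iterates converge geometrically; the paper's theorem is what lets such a synchronous argument transfer to the asynchronous, sampled setting.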
Cite
Text
Szepesvári and Littman. "A Unified Analysis of Value-Function-Based Reinforcement Learning Algorithms." Neural Computation, 1999. doi:10.1162/089976699300016070
Markdown
[Szepesvári and Littman. "A Unified Analysis of Value-Function-Based Reinforcement Learning Algorithms." Neural Computation, 1999.](https://mlanthology.org/neco/1999/szepesvari1999neco-unified/) doi:10.1162/089976699300016070
BibTeX
@article{szepesvari1999neco-unified,
title = {{A Unified Analysis of Value-Function-Based Reinforcement Learning Algorithms}},
author = {Szepesvári, Csaba and Littman, Michael L.},
journal = {Neural Computation},
year = {1999},
pages = {2017-2060},
doi = {10.1162/089976699300016070},
volume = {11},
url = {https://mlanthology.org/neco/1999/szepesvari1999neco-unified/}
}