An Analysis of Reinforcement Learning with Function Approximation
Abstract
We address the problem of computing the optimal Q-function in Markov decision problems with infinite state-space. We analyze the convergence properties of several variations of Q-learning when combined with function approximation, extending the analysis of TD-learning in (Tsitsiklis and Van Roy, 1996) to stochastic control settings. We identify conditions under which such approximate methods converge with probability 1. We conclude with a brief discussion on the general applicability of our results and compare them with several related works.
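To make the setting concrete, below is a minimal sketch of Q-learning with linear function approximation, the class of methods the paper analyzes. The toy two-state MDP, the one-hot features, the uniform exploration policy, and the constant step size are all illustrative assumptions, not details taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

n_states, n_actions = 2, 2
# P[s, a] is a distribution over next states (assumed toy MDP)
P = np.array([[[0.9, 0.1], [0.2, 0.8]],
              [[0.7, 0.3], [0.4, 0.6]]])
R = np.array([[1.0, 0.0], [0.0, 1.0]])  # reward r(s, a)
gamma = 0.9

def phi(s, a):
    """One-hot features over (state, action) pairs; with an exact
    representation like this the approximation error is zero."""
    f = np.zeros(n_states * n_actions)
    f[s * n_actions + a] = 1.0
    return f

theta = np.zeros(n_states * n_actions)  # linear weights, Q(s,a) ≈ phi(s,a)·theta
s = 0
for t in range(20000):
    a = rng.integers(n_actions)              # uniform exploration policy
    s_next = rng.choice(n_states, p=P[s, a])
    q_next = max(phi(s_next, b) @ theta for b in range(n_actions))
    delta = R[s, a] + gamma * q_next - phi(s, a) @ theta  # TD error
    theta += 0.1 * delta * phi(s, a)         # constant step size (assumption)
    s = s_next

Q = theta.reshape(n_states, n_actions)       # estimated Q-function
```

With one-hot features the update reduces to tabular Q-learning, so it converges here; the paper's contribution is identifying conditions under which convergence still holds with genuinely approximate (e.g. low-dimensional linear) features.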
Cite
Text
Melo et al. "An Analysis of Reinforcement Learning with Function Approximation." International Conference on Machine Learning, 2008. doi:10.1145/1390156.1390240
Markdown
[Melo et al. "An Analysis of Reinforcement Learning with Function Approximation." International Conference on Machine Learning, 2008.](https://mlanthology.org/icml/2008/melo2008icml-analysis/) doi:10.1145/1390156.1390240
BibTeX
@inproceedings{melo2008icml-analysis,
title = {{An Analysis of Reinforcement Learning with Function Approximation}},
author = {Melo, Francisco S. and Meyn, Sean P. and Ribeiro, M. Isabel},
booktitle = {International Conference on Machine Learning},
year = {2008},
pages = {664--671},
doi = {10.1145/1390156.1390240},
url = {https://mlanthology.org/icml/2008/melo2008icml-analysis/}
}