Q-Learning with Linear Function Approximation

Abstract

In this paper, we analyze the convergence of Q-learning with linear function approximation. We identify a set of conditions that implies the convergence of this method with probability 1, when a fixed learning policy is used. We discuss the differences and similarities between our results and those obtained in several related works. We also discuss the applicability of this method when a changing policy is used. Finally, we describe the applicability of this approximate method in partially observable scenarios.
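
For readers unfamiliar with the method named in the abstract, the following is a minimal sketch of a Q-learning update with a linear approximation Q(s, a) = phi(s, a) . theta, run under a fixed (uniform random) learning policy. It is not taken from the paper: the three-state chain, the one-hot feature map, the constant step size, and every name in the code (`step`, `features`, `q_value`, `train`) are assumptions chosen for illustration. With one-hot features the linear form reduces to the tabular case, which is why this toy example is expected to behave well.

```python
import random

# Hypothetical toy problem (not from the paper): a 3-state chain
# with actions 0 = left and 1 = right.
N_STATES, N_ACTIONS = 3, 2
GAMMA = 0.9


def step(state, action):
    """Deterministic toy dynamics: reward 1.0 for arriving at the rightmost state."""
    nxt = max(0, state - 1) if action == 0 else min(N_STATES - 1, state + 1)
    return nxt, (1.0 if nxt == N_STATES - 1 else 0.0)


def features(state, action):
    """Assumed feature map: one-hot over (state, action) pairs."""
    phi = [0.0] * (N_STATES * N_ACTIONS)
    phi[state * N_ACTIONS + action] = 1.0
    return phi


def q_value(theta, state, action):
    """Linear approximation Q(s, a) = phi(s, a) . theta."""
    return sum(p * t for p, t in zip(features(state, action), theta))


def train(num_steps=20000, alpha=0.1, seed=0):
    """Run Q-learning with a linear approximator under a fixed random policy."""
    rng = random.Random(seed)
    theta = [0.0] * (N_STATES * N_ACTIONS)
    state = 0
    for _ in range(num_steps):
        # Fixed learning policy: actions are drawn uniformly at random.
        action = rng.randrange(N_ACTIONS)
        nxt, reward = step(state, action)
        # Temporal-difference error with a greedy bootstrap at the next state.
        best_next = max(q_value(theta, nxt, b) for b in range(N_ACTIONS))
        td_error = reward + GAMMA * best_next - q_value(theta, state, action)
        # Move theta along the feature vector of the visited pair.
        phi = features(state, action)
        theta = [t + alpha * td_error * p for t, p in zip(theta, phi)]
        state = nxt
    return theta
```

Calling `train()` returns the learned weight vector, and `q_value(theta, s, a)` reads off the approximate action value. Replacing `features` with a lower-dimensional map gives genuine function approximation, in which case whether the iteration converges depends on conditions of the kind the paper analyzes.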

Cite

Text

Melo and Ribeiro. "Q-Learning with Linear Function Approximation." Annual Conference on Computational Learning Theory, 2007. doi:10.1007/978-3-540-72927-3_23

Markdown

[Melo and Ribeiro. "Q-Learning with Linear Function Approximation." Annual Conference on Computational Learning Theory, 2007.](https://mlanthology.org/colt/2007/melo2007colt-q/) doi:10.1007/978-3-540-72927-3_23

BibTeX

@inproceedings{melo2007colt-q,
  title     = {{Q-Learning with Linear Function Approximation}},
  author    = {Melo, Francisco S. and Ribeiro, M. Isabel},
  booktitle = {Annual Conference on Computational Learning Theory},
  year      = {2007},
  pages     = {308-322},
  doi       = {10.1007/978-3-540-72927-3_23},
  url       = {https://mlanthology.org/colt/2007/melo2007colt-q/}
}