Q-Learning with Linear Function Approximation
Abstract
In this paper, we analyze the convergence of Q-learning with linear function approximation. We identify a set of conditions that implies the convergence of this method with probability 1, when a fixed learning policy is used. We discuss the differences and similarities between our results and those obtained in several related works. We also discuss the applicability of this method when a changing policy is used. Finally, we describe the applicability of this approximate method in partially observable scenarios.
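As a rough illustration of the method the abstract refers to, the following sketch shows the standard Q-learning update applied to a linear parameterization Q(x, a) = φ(x, a)ᵀθ, with samples gathered under a fixed learning policy. The toy environment, the one-hot feature map, the step size, and all function names are assumptions made for illustration and are not taken from the paper. The paper's convergence conditions are not reproduced here.

```python
import random

def q_value(theta, phi):
    """Linear approximation: Q(x, a) = phi(x, a) . theta."""
    return sum(t * f for t, f in zip(theta, phi))

def q_learning_step(theta, phi_xa, reward, next_phis, alpha, gamma):
    """One Q-learning update of the weight vector theta.

    phi_xa    : features of the visited state-action pair
    next_phis : features of every action available in the next state
    """
    target = reward + gamma * max(q_value(theta, p) for p in next_phis)
    td_error = target - q_value(theta, phi_xa)
    # Move theta along the feature vector, scaled by the TD error.
    return [t + alpha * td_error * f for t, f in zip(theta, phi_xa)]

# Hypothetical toy problem (an assumption, not from the paper):
# 2 states, 2 actions, one-hot features over state-action pairs.
N_STATES, N_ACTIONS = 2, 2

def features(state, action):
    phi = [0.0] * (N_STATES * N_ACTIONS)
    phi[state * N_ACTIONS + action] = 1.0
    return phi

def env_step(state, action):
    # Assumed dynamics: the action chosen becomes the next state,
    # and entering state 1 yields reward 1.
    next_state = action
    reward = 1.0 if next_state == 1 else 0.0
    return next_state, reward

def train(steps=5000, alpha=0.1, gamma=0.5, seed=0):
    rng = random.Random(seed)
    theta = [0.0] * (N_STATES * N_ACTIONS)
    state = 0
    for _ in range(steps):
        # Fixed learning policy: uniform random, independent of theta.
        action = rng.randrange(N_ACTIONS)
        next_state, reward = env_step(state, action)
        next_phis = [features(next_state, b) for b in range(N_ACTIONS)]
        theta = q_learning_step(theta, features(state, action),
                                reward, next_phis, alpha, gamma)
        state = next_state
    return theta

theta = train()
print(theta)
```

With one-hot features the linear model reduces to a table, so in this toy problem the weights should approach the optimal values Q(x, 0) = 1 and Q(x, 1) = 2 for both states. General feature maps are the harder case that the paper's conditions address.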
Cite
Text
Melo and Ribeiro. "Q-Learning with Linear Function Approximation." Annual Conference on Computational Learning Theory, 2007. doi:10.1007/978-3-540-72927-3_23
Markdown
[Melo and Ribeiro. "Q-Learning with Linear Function Approximation." Annual Conference on Computational Learning Theory, 2007.](https://mlanthology.org/colt/2007/melo2007colt-q/) doi:10.1007/978-3-540-72927-3_23
BibTeX
@inproceedings{melo2007colt-q,
title = {{Q-Learning with Linear Function Approximation}},
author = {Melo, Francisco S. and Ribeiro, M. Isabel},
booktitle = {Annual Conference on Computational Learning Theory},
year = {2007},
pages = {308-322},
doi = {10.1007/978-3-540-72927-3_23},
url = {https://mlanthology.org/colt/2007/melo2007colt-q/}
}