A Unified Switching System Perspective and Convergence Analysis of Q-Learning Algorithms

Abstract

This paper develops a novel and unified framework to analyze the convergence of a large family of Q-learning algorithms from the switching system perspective. We show that the nonlinear ODE models associated with Q-learning and many of its variants can be naturally formulated as affine switching systems. Building on their asymptotic stability, we obtain a number of interesting results: (i) we provide a simple ODE analysis for the convergence of asynchronous Q-learning under relatively weak assumptions; (ii) we establish the first convergence analysis of the averaging Q-learning algorithm; and (iii) we derive a new sufficient condition for the convergence of Q-learning with linear function approximation.
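
To make the switching-system view concrete, here is a minimal sketch of how the ODE model of tabular Q-learning can be read as an affine switching system. The notation is assumed for illustration and is not taken from the abstract: $Q$ stacks the Q-values into a vector, $R$ is the expected reward vector, $P$ is the transition matrix from state-action pairs to next states, $D$ is a diagonal matrix of state-action visit probabilities, $\gamma$ is the discount factor, and $\Pi_\pi$ is the matrix that selects $Q(s, \pi(s))$ for each state $s$. The paper's own formulation may differ in details, for example by shifting coordinates to the error $Q_t - Q^*$.

```latex
% Assumed notation (illustrative only, see the lead-in paragraph):
%   Q: stacked Q-values, R: expected rewards, P: transition matrix,
%   D: diagonal visit-probability matrix, \gamma: discount factor,
%   \Pi_\pi: selects Q(s, \pi(s)) for every state s.
\begin{align*}
\dot{Q}_t
  &= D\bigl(R + \gamma P\,\Pi_{\pi_{Q_t}} Q_t - Q_t\bigr),
  \qquad \pi_{Q}(s) \in \arg\max_{a} Q(s,a), \\
  &= A_{\sigma(Q_t)}\, Q_t + b,
  \qquad A_{\sigma} = D\bigl(\gamma P\,\Pi_{\sigma} - I\bigr),
  \quad b = D R .
\end{align*}
```

For a fixed greedy policy $\sigma$ the right-hand side is affine in $Q_t$. The max operator only changes which matrix $A_\sigma$ is active, so the greedy policy plays the role of the switching signal.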

Cite

Text

Lee and He. "A Unified Switching System Perspective and Convergence Analysis of Q-Learning Algorithms." Neural Information Processing Systems, 2020.

Markdown

[Lee and He. "A Unified Switching System Perspective and Convergence Analysis of Q-Learning Algorithms." Neural Information Processing Systems, 2020.](https://mlanthology.org/neurips/2020/lee2020neurips-unified/)

BibTeX

@inproceedings{lee2020neurips-unified,
  title     = {{A Unified Switching System Perspective and Convergence Analysis of Q-Learning Algorithms}},
  author    = {Lee, Donghwan and He, Niao},
  booktitle = {Neural Information Processing Systems},
  year      = {2020},
  url       = {https://mlanthology.org/neurips/2020/lee2020neurips-unified/}
}