Finite-Time Theory for Momentum Q-Learning

Abstract

Existing studies indicate that momentum ideas from conventional optimization can be used to improve the performance of Q-learning algorithms. However, finite-time analysis for momentum-based Q-learning algorithms is available only for the tabular case without function approximation. This paper analyzes a class of momentum-based Q-learning algorithms with finite-time convergence guarantees. Specifically, we propose the MomentumQ algorithm, which integrates Nesterov's and Polyak's momentum schemes and generalizes existing momentum-based Q-learning algorithms. For the infinite state-action space case, we establish a convergence guarantee for MomentumQ with linear function approximation under Markovian sampling. In particular, we characterize a finite-time convergence rate that is provably faster than that of vanilla Q-learning. This is the first finite-time analysis for momentum-based Q-learning algorithms with function approximation. For the tabular case under synchronous sampling, we also obtain a finite-time convergence rate that is slightly better than that of SpeedyQ (Azar et al., NIPS 2011). Finally, we demonstrate through various experiments that the proposed MomentumQ outperforms other momentum-based Q-learning algorithms.
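
To make the idea concrete, below is a minimal sketch of a synchronous tabular Q-iteration with a Polyak-style (heavy-ball) momentum term. This is an illustration of the general momentum idea the abstract describes, not the paper's exact MomentumQ recursion: the step size `alpha`, momentum weight `beta`, and the toy MDP are placeholder assumptions for demonstration only.

```python
# Illustrative sketch: tabular Q-iteration with a heavy-ball momentum term.
# NOT the paper's exact MomentumQ update; alpha, beta, and the toy MDP
# below are placeholder choices for illustration.
import numpy as np

def momentum_q_iteration(P, R, gamma=0.9, alpha=0.5, beta=0.1, iters=500):
    """Synchronous tabular Q-iteration with a Polyak-style momentum term.

    P: transition tensor of shape (S, A, S); R: reward matrix of shape (S, A).
    """
    S, A = R.shape
    Q_prev = np.zeros((S, A))
    Q = np.zeros((S, A))
    for _ in range(iters):
        # Bellman optimality operator (exact here, since P and R are known;
        # Q-learning would replace this with a sampled estimate).
        TQ = R + gamma * P @ Q.max(axis=1)          # shape (S, A)
        # Vanilla update direction plus momentum on the previous displacement.
        Q_next = Q + alpha * (TQ - Q) + beta * (Q - Q_prev)
        Q_prev, Q = Q, Q_next
    return Q

# Toy 2-state, 2-action MDP (placeholder data).
P = np.array([[[0.9, 0.1], [0.2, 0.8]],
              [[0.6, 0.4], [0.3, 0.7]]])
R = np.array([[1.0, 0.0],
              [0.0, 2.0]])
print(momentum_q_iteration(P, R))
```

With `beta = 0` this reduces to the vanilla update; the momentum term reuses the previous displacement `Q - Q_prev`, which is the mechanism the paper's analysis shows can accelerate convergence.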

Cite

Text

Weng et al. "Finite-Time Theory for Momentum Q-Learning." Uncertainty in Artificial Intelligence, 2021.

Markdown

[Weng et al. "Finite-Time Theory for Momentum Q-Learning." Uncertainty in Artificial Intelligence, 2021.](https://mlanthology.org/uai/2021/bowen2021uai-finitetime/)

BibTeX

@inproceedings{bowen2021uai-finitetime,
  title     = {{Finite-Time Theory for Momentum Q-Learning}},
  author    = {Weng, Bowen and Xiong, Huaqing and Zhao, Lin and Liang, Yingbin and Zhang, Wei},
  booktitle = {Uncertainty in Artificial Intelligence},
  year      = {2021},
  pages     = {665--674},
  volume    = {161},
  url       = {https://mlanthology.org/uai/2021/bowen2021uai-finitetime/}
}