Multi-Bellman Operator for Convergence of $Q$-Learning with Linear Function Approximation
Abstract
We investigate the convergence of $Q$-learning with linear function approximation and introduce the multi-Bellman operator, an extension of the traditional Bellman operator. By analyzing the properties of this operator, we identify conditions under which the projected multi-Bellman operator becomes a contraction, yielding stronger fixed-point guarantees compared to the original Bellman operator. Building on these insights, we propose the multi-$Q$-learning algorithm, which achieves convergence and approximates the optimal solution with arbitrary precision. This contrasts with traditional $Q$-learning, which lacks such convergence guarantees. Finally, we empirically validate our theoretical results.
Cite
Text
Carvalho et al. "Multi-Bellman Operator for Convergence of $Q$-Learning with Linear Function Approximation." Transactions on Machine Learning Research, 2025.
Markdown
[Carvalho et al. "Multi-Bellman Operator for Convergence of $Q$-Learning with Linear Function Approximation." Transactions on Machine Learning Research, 2025.](https://mlanthology.org/tmlr/2025/carvalho2025tmlr-multibellman/)
BibTeX
@article{carvalho2025tmlr-multibellman,
title = {{Multi-Bellman Operator for Convergence of $Q$-Learning with Linear Function Approximation}},
author = {Carvalho, Diogo S. and Santos, Pedro A. and Melo, Francisco S.},
journal = {Transactions on Machine Learning Research},
year = {2025},
url = {https://mlanthology.org/tmlr/2025/carvalho2025tmlr-multibellman/}
}