Finite-Sample Guarantees for Nash Q-Learning with Linear Function Approximation

Abstract

Nash Q-learning is arguably one of the first and best-known algorithms in multi-agent reinforcement learning (MARL) for learning policies that constitute a Nash equilibrium of an underlying general-sum Markov game. Its original analysis provided asymptotic guarantees and was restricted to the tabular case. Recently, finite-sample guarantees have been obtained for the tabular case using more modern RL techniques. Our work analyzes Nash Q-learning with linear function approximation, a representation regime used when the state space is large or continuous, and provides finite-sample guarantees that indicate its sample efficiency. We find that the obtained performance nearly matches an existing efficient result for single-agent RL under the same representation, and has a polynomial gap when compared to the best-known result for the tabular case.
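
Below is a minimal, illustrative sketch of the kind of update Nash Q-learning performs when each player's Q-function is represented linearly in a feature map. It is not the authors' exact algorithm or analysis setting: the feature map `phi`, the transition format, the use of the `nashpy` solver for the stage game, and all hyperparameters are assumptions made only for illustration.

```python
# Illustrative two-player Nash Q-learning step with linear function
# approximation (an assumption-laden sketch, not the paper's algorithm).
import numpy as np
import nashpy  # third-party bimatrix game solver, used here for the stage game

def nash_q_step(theta, phi, sample, n_actions, gamma=0.99, lr=0.1):
    """One update for players i = 0, 1 with Q_i(s, a) = phi(s, a) @ theta[i].

    `sample` is a single observed transition (s, (a0, a1), (r0, r1), s_next).
    `phi` maps a state and a joint action to a feature vector of dimension d,
    and `theta` is a (2, d) array of linear parameters, one row per player.
    """
    s, joint_a, rewards, s_next = sample

    # Stage game at the next state, built from both players' current Q estimates.
    A = np.array([[phi(s_next, (a0, a1)) @ theta[0] for a1 in range(n_actions)]
                  for a0 in range(n_actions)])
    B = np.array([[phi(s_next, (a0, a1)) @ theta[1] for a1 in range(n_actions)]
                  for a0 in range(n_actions)])

    # Take one Nash equilibrium of the (general-sum) stage game.
    sigma0, sigma1 = next(nashpy.Game(A, B).support_enumeration())
    nash_values = (sigma0 @ A @ sigma1, sigma0 @ B @ sigma1)

    # TD-style update of each player's linear parameters toward the Nash target.
    feat = phi(s, joint_a)
    for i in range(2):
        target = rewards[i] + gamma * nash_values[i]
        theta[i] += lr * (target - feat @ theta[i]) * feat
    return theta
```

The defining difference from single-agent Q-learning is the bootstrap target: it uses the value of a Nash equilibrium of the stage game induced at the next state rather than a simple maximum over the learner's own actions.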

Cite

Text

Cisneros-Velarde and Koyejo. "Finite-Sample Guarantees for Nash Q-Learning with Linear Function Approximation." Uncertainty in Artificial Intelligence, 2023.

Markdown

[Cisneros-Velarde and Koyejo. "Finite-Sample Guarantees for Nash Q-Learning with Linear Function Approximation." Uncertainty in Artificial Intelligence, 2023.](https://mlanthology.org/uai/2023/cisnerosvelarde2023uai-finitesample/)

BibTeX

@inproceedings{cisnerosvelarde2023uai-finitesample,
  title     = {{Finite-Sample Guarantees for Nash Q-Learning with Linear Function Approximation}},
  author    = {Cisneros-Velarde, Pedro and Koyejo, Sanmi},
  booktitle = {Uncertainty in Artificial Intelligence},
  year      = {2023},
  pages     = {424--432},
  volume    = {216},
  url       = {https://mlanthology.org/uai/2023/cisnerosvelarde2023uai-finitesample/}
}