Variational Inequality Perspective and Optimizers for Multi-Agent Reinforcement Learning

Sidahmed, Baraah A. M.; Chavdarova, Tatjana

NeurIPSW 2024

Abstract

Multi-agent reinforcement learning (MARL) presents unique challenges as agents learn strategies through trial and error. Gradient-based methods are often sensitive to hyperparameter selection and to variations in the initial random seed. Recent progress has been made in solving problems modeled by Variational Inequalities (VIs), which include equilibrium-finding problems, particularly in addressing the rotational dynamics that impede convergence of traditional gradient-based optimization methods. This paper explores the potential of leveraging VI-based techniques to improve MARL training. Specifically, we study the performance of two VI methods, Nested-Lookahead VI (nLA-VI) and Extragradient (EG), in enhancing the multi-agent deep deterministic policy gradient (MADDPG) algorithm. We present a VI reformulation of the actor-critic algorithm for both single- and multi-agent settings. We introduce three algorithms that use nLA-VI, EG, and a combination of both, named LA-MADDPG, EG-MADDPG, and LA-EG-MADDPG, respectively. Our empirical results demonstrate that these VI-based approaches yield significant performance improvements in benchmark environments: the zero-sum games rock-paper-scissors and matching pennies, where equilibrium strategies can be quantitatively assessed, and the MPE Predator-prey environment [Lowe et al., 2017], where VI-based methods also foster more balanced participation among agents on the same team.
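The rotational dynamics mentioned in the abstract can be illustrated on a toy problem. The sketch below is not the paper's MADDPG setup: it assumes the bilinear zero-sum game f(x, y) = x * y, whose equilibrium is (0, 0), and compares plain simultaneous gradient descent-ascent (GDA) with Extragradient and with a single-level lookahead wrapper around GDA (a simplification of the nested variant). The function names, step size, and the lookahead parameters `k` and `alpha` are illustrative choices, not values from the paper.

```python
import math


def operator(z):
    """VI operator F(x, y) = (df/dx, -df/dy) for the toy game f(x, y) = x * y."""
    x, y = z
    return (y, -x)


def gda_step(z, lr):
    """One simultaneous gradient descent-ascent step: z <- z - lr * F(z)."""
    g = operator(z)
    return (z[0] - lr * g[0], z[1] - lr * g[1])


def eg_step(z, lr):
    """One Extragradient step: extrapolate first, then update from the original point."""
    z_half = gda_step(z, lr)          # extrapolation point
    g = operator(z_half)              # operator evaluated at the extrapolation
    return (z[0] - lr * g[0], z[1] - lr * g[1])


def lookahead_step(z, lr, k, alpha):
    """One lookahead step (single level): run k GDA steps, then average back."""
    fast = z
    for _ in range(k):
        fast = gda_step(fast, lr)
    return (z[0] + alpha * (fast[0] - z[0]), z[1] + alpha * (fast[1] - z[1]))


def distance_after(step, z0, n_steps, **kwargs):
    """Distance to the equilibrium (0, 0) after n_steps applications of `step`."""
    z = z0
    for _ in range(n_steps):
        z = step(z, **kwargs)
    return math.hypot(*z)


if __name__ == "__main__":
    z0 = (1.0, 1.0)
    print("start     :", math.hypot(*z0))
    print("GDA       :", distance_after(gda_step, z0, 500, lr=0.1))
    print("EG        :", distance_after(eg_step, z0, 500, lr=0.1))
    print("Lookahead :", distance_after(lookahead_step, z0, 25, lr=0.1, k=20, alpha=0.5))
```

On this toy game, each GDA step multiplies the distance to the equilibrium by sqrt(1 + lr^2), so the iterates spiral outward, while the Extragradient and lookahead iterates contract toward (0, 0). Whether the same behavior carries over to actor-critic training is an empirical question that the paper itself addresses.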

Cite

Text

Sidahmed and Chavdarova. "Variational Inequality Perspective and Optimizers for Multi-Agent Reinforcement Learning." NeurIPS 2024 Workshops: OWA, 2024.

Markdown

[Sidahmed and Chavdarova. "Variational Inequality Perspective and Optimizers for Multi-Agent Reinforcement Learning." NeurIPS 2024 Workshops: OWA, 2024.](https://mlanthology.org/neuripsw/2024/sidahmed2024neuripsw-variational/)

BibTeX

@inproceedings{sidahmed2024neuripsw-variational,
  title     = {{Variational Inequality Perspective and Optimizers for Multi-Agent Reinforcement Learning}},
  author    = {Sidahmed, Baraah A. M. and Chavdarova, Tatjana},
  booktitle = {NeurIPS 2024 Workshops: OWA},
  year      = {2024},
  url       = {https://mlanthology.org/neuripsw/2024/sidahmed2024neuripsw-variational/}
}