New Criteria and a New Algorithm for Learning in Multi-Agent Systems

Abstract

We propose a new set of criteria for learning algorithms in multi-agent systems, one that is more stringent and (we argue) better justified than previously proposed criteria. Our criteria, which apply most straightforwardly in repeated games with average rewards, consist of three requirements: (a) against a specified class of opponents (this class is a parameter of the criterion) the algorithm yield a payoff that approaches the payoff of the best response, (b) against other opponents the algorithm's payoff at least approach (and possibly exceed) the security level payoff (or maximin value), and (c) subject to these requirements, the algorithm achieve a close to optimal payoff in self-play. We furthermore require that these average payoffs be achieved quickly. We then present a novel algorithm, and show that it meets these new criteria for a particular parameter class, the class of stationary opponents. Finally, we show that the algorithm is effective not only in theory, but also empirically. Using a recently introduced comprehensive game-theoretic test suite, we show that the algorithm almost universally outperforms previous learning algorithms.
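Criteria (a) and (b) compare the learner's average payoff to two standard game-theoretic benchmarks: the best-response payoff against a fixed (stationary) opponent strategy, and the security level (maximin value) of the learner's payoff matrix. The sketch below is purely illustrative and is not taken from the paper; the payoff matrix, opponent strategy, and function names are hypothetical, and the maximin value is computed with a generic linear program.

# Minimal sketch (not the authors' implementation) of the two payoff
# benchmarks referenced by criteria (a) and (b).
import numpy as np
from scipy.optimize import linprog

def best_response_payoff(payoffs, opponent_strategy):
    """Highest expected payoff against a fixed (stationary) mixed strategy."""
    expected = payoffs @ opponent_strategy   # expected payoff of each row action
    return expected.max()

def security_level(payoffs):
    """Maximin value: the payoff the row player can guarantee with a mixed strategy."""
    m, n = payoffs.shape
    # Variables: mixed strategy x (m entries) and the guaranteed value v.
    # Maximize v  <=>  minimize -v.
    c = np.concatenate([np.zeros(m), [-1.0]])
    # For every opponent column j:  v - sum_i x_i * payoffs[i, j] <= 0
    A_ub = np.hstack([-payoffs.T, np.ones((n, 1))])
    b_ub = np.zeros(n)
    # Probabilities sum to one.
    A_eq = np.concatenate([np.ones(m), [0.0]]).reshape(1, -1)
    b_eq = np.array([1.0])
    bounds = [(0, None)] * m + [(None, None)]
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq, bounds=bounds)
    return -res.fun

# Hypothetical 2x2 game (row player's payoffs) and a stationary opponent.
A = np.array([[3.0, 0.0],
              [1.0, 2.0]])
opponent = np.array([0.25, 0.75])
print(best_response_payoff(A, opponent))   # target payoff for criterion (a): 1.75
print(security_level(A))                   # lower bound for criterion (b): 1.5

Criterion (a) asks that, against opponents in the specified class, the algorithm's average payoff approach the first quantity; criterion (b) asks that, against arbitrary opponents, it at least approach the second.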

Cite

Text

Powers and Shoham. "New Criteria and a New Algorithm for Learning in Multi-Agent Systems." Neural Information Processing Systems, 2004.

Markdown

[Powers and Shoham. "New Criteria and a New Algorithm for Learning in Multi-Agent Systems." Neural Information Processing Systems, 2004.](https://mlanthology.org/neurips/2004/powers2004neurips-new/)

BibTeX

@inproceedings{powers2004neurips-new,
  title     = {{New Criteria and a New Algorithm for Learning in Multi-Agent Systems}},
  author    = {Powers, Rob and Shoham, Yoav},
  booktitle = {Neural Information Processing Systems},
  year      = {2004},
  pages     = {1089--1096},
  url       = {https://mlanthology.org/neurips/2004/powers2004neurips-new/}
}