Performance Bounded Reinforcement Learning in Strategic Interactions
Abstract
Despite the increasing deployment of agent technologies in several business and industry domains, user confidence in fully automated, agent-driven applications is noticeably lacking. The main reasons for this lack of trust in complete automation are scalability and the absence of reasonable performance guarantees for self-adapting software. In this paper we address the latter issue in the context of learning agents in a Multiagent System (MAS). Performance guarantees for most existing on-line Multiagent Learning (MAL) algorithms are realizable only in the limit, which seriously limits their practical utility. Our goal is to provide meaningful guarantees about the performance of a learner in a MAS while it is learning. In particular, we present a novel MAL algorithm that (i) converges to a best response against stationary opponents, (ii) converges to a Nash equilibrium in self-play, and (iii) achieves a constant bounded expected regret at any time (no-average-regret asymptotically) in arbitrarily sized general-sum games with non-negative payoffs, and against any number of opponents.
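To make the regret guarantee concrete, here is a minimal sketch of external regret in a repeated matrix game: the payoff the best fixed action would have earned in hindsight, minus what the learner actually earned. "No-average-regret" means regret(T)/T → 0 as T grows. The payoff matrix and action sequences below are hypothetical illustrations, not the paper's algorithm.

```python
import numpy as np

def external_regret(payoff, my_actions, opp_actions):
    """External regret of the row player over a finished play sequence.

    payoff: row player's payoff matrix (rows = own actions,
            columns = opponent actions).
    """
    payoff = np.asarray(payoff, dtype=float)
    my_actions = np.asarray(my_actions)
    opp_actions = np.asarray(opp_actions)

    # Payoff the learner actually accumulated over the sequence.
    actual = payoff[my_actions, opp_actions].sum()

    # Payoff of the best single fixed action, judged in hindsight
    # against the realized opponent sequence.
    best_fixed = max(payoff[a, opp_actions].sum()
                     for a in range(payoff.shape[0]))

    return best_fixed - actual

# A hypothetical 2x2 general-sum game with non-negative payoffs
# (row player's payoffs only, since regret is defined per player).
G = [[3.0, 0.0],
     [1.0, 2.0]]

print(external_regret(G, [0, 0, 1, 1], [0, 1, 0, 1]))  # 0.0
print(external_regret(G, [1, 0, 1, 0], [0, 1, 0, 1]))  # 4.0
```

A constant bound on this quantity at every time step, as claimed in the abstract, is stronger than the usual asymptotic statement that the time-averaged regret vanishes.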
Cite
Text

Banerjee and Peng. "Performance Bounded Reinforcement Learning in Strategic Interactions." AAAI Conference on Artificial Intelligence, 2004.

Markdown

[Banerjee and Peng. "Performance Bounded Reinforcement Learning in Strategic Interactions." AAAI Conference on Artificial Intelligence, 2004.](https://mlanthology.org/aaai/2004/banerjee2004aaai-performance/)

BibTeX
@inproceedings{banerjee2004aaai-performance,
title = {{Performance Bounded Reinforcement Learning in Strategic Interactions}},
author = {Banerjee, Bikramjit and Peng, Jing},
booktitle = {AAAI Conference on Artificial Intelligence},
year = {2004},
pages = {2--7},
url = {https://mlanthology.org/aaai/2004/banerjee2004aaai-performance/}
}