Performance Bounded Reinforcement Learning in Strategic Interactions
Abstract
Despite the increasing deployment of agent technologies in several business and industry domains, user confidence in fully automated, agent-driven applications is noticeably lacking. The main reasons for this lack of trust in complete automation are scalability and the absence of reasonable performance guarantees for self-adapting software. In this paper we address the latter issue in the context of learning agents in a Multiagent System (MAS). Performance guarantees for most existing on-line Multiagent Learning (MAL) algorithms are realizable only in the limit, which seriously limits their practical utility. Our goal is to provide meaningful guarantees about the performance of a learner in a MAS while it is learning. In particular, we present a novel MAL algorithm that (i) converges to a best response against stationary opponents, (ii) converges to a Nash equilibrium in self-play, and (iii) achieves a constant bounded expected regret at any time (no-average-regret asymptotically) in arbitrarily sized general-sum games with non-negative payoffs, and against any number of opponents.
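To make the regret guarantee concrete, here is a minimal sketch of external regret in a repeated matrix game: the payoff the best fixed action would have earned in hindsight, minus what the learner actually earned. "No-average-regret" means regret(T)/T → 0 as T grows. The payoff matrix and action sequences below are hypothetical illustrations, not the paper's algorithm.

```python
import numpy as np

def external_regret(payoff, my_actions, opp_actions):
    """External regret of the row player over a finished play sequence.

    payoff: row player's payoff matrix (rows = own actions,
            columns = opponent actions).
    """
    payoff = np.asarray(payoff, dtype=float)
    my_actions = np.asarray(my_actions)
    opp_actions = np.asarray(opp_actions)

    # Payoff the learner actually accumulated over the sequence.
    actual = payoff[my_actions, opp_actions].sum()

    # Payoff of the best single fixed action, judged in hindsight
    # against the realized opponent sequence.
    best_fixed = max(payoff[a, opp_actions].sum()
                     for a in range(payoff.shape[0]))

    return best_fixed - actual

# A hypothetical 2x2 general-sum game with non-negative payoffs
# (row player's payoffs only, since regret is defined per player).
G = [[3.0, 0.0],
     [1.0, 2.0]]

print(external_regret(G, [0, 0, 1, 1], [0, 1, 0, 1]))  # 0.0
print(external_regret(G, [1, 0, 1, 0], [0, 1, 0, 1]))  # 4.0
```

A constant bound on this quantity at every time step, as claimed in the abstract, is stronger than the usual asymptotic statement that the time-averaged regret vanishes.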
Cite
Text

Banerjee and Peng. "Performance Bounded Reinforcement Learning in Strategic Interactions." AAAI Conference on Artificial Intelligence, 2004.

Markdown

[Banerjee and Peng. "Performance Bounded Reinforcement Learning in Strategic Interactions." AAAI Conference on Artificial Intelligence, 2004.](https://mlanthology.org/aaai/2004/banerjee2004aaai-performance/)

BibTeX
@inproceedings{banerjee2004aaai-performance,
title = {{Performance Bounded Reinforcement Learning in Strategic Interactions}},
author = {Banerjee, Bikramjit and Peng, Jing},
booktitle = {AAAI Conference on Artificial Intelligence},
year = {2004},
pages = {2--7},
url = {https://mlanthology.org/aaai/2004/banerjee2004aaai-performance/}
}