Do LLM Agents Have Regret? A Case Study in Online Learning and Games

Abstract

Large language models (LLMs) have been increasingly employed for (interactive) decision-making, via the development of LLM-based autonomous agents. Despite their emerging successes, the performance of LLM agents in decision-making has not been fully investigated through rigorous metrics, especially in the multi-agent setting when they interact with each other, a typical scenario in real-world LLM-agent applications. To better understand the limits of LLM agents in these interactive environments, we propose to study their interactions in the benchmark decision-making settings of \emph{online learning} and \emph{games}, through the performance metric of \emph{regret}. We first empirically study the \emph{no-regret} behaviors of LLMs in canonical (non-stationary) online learning problems, as well as the emergence of equilibria when LLM agents interact through playing repeated games. We then provide theoretical insights into the no-regret behaviors of LLM agents, under certain assumptions on the \emph{supervised} pre-training and the \emph{rationality} model of the human decision-makers who generate the data. Notably, we also identify (simple) cases where advanced LLMs such as GPT-4 fail to be no-regret. To promote no-regret behaviors, we propose a novel \emph{unsupervised} training loss, the \emph{regret-loss}, which, in contrast to the supervised pre-training loss, does not require labels of (optimal) actions. We then establish a statistical guarantee, in the form of a generalization bound, for regret-loss minimization, followed by an optimization guarantee that minimizing such a loss may automatically lead to known no-regret learning algorithms. Our further experiments demonstrate the effectiveness of our regret-loss, especially in addressing the above ``regrettable'' cases.
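To make the regret metric concrete: external regret is the learner's cumulative loss minus that of the best fixed action in hindsight, and a no-regret algorithm keeps this quantity sublinear in the horizon $T$. Below is a minimal sketch (not the paper's method) that computes the regret of the classical Hedge (multiplicative-weights) algorithm, a standard no-regret baseline, on a random loss sequence; all names here are illustrative.

```python
import numpy as np

def hedge_regret(losses, eta=0.5):
    """Run Hedge (multiplicative weights) on a T x n matrix of losses in
    [0, 1] and return the external regret: the learner's expected cumulative
    loss minus that of the best fixed action in hindsight."""
    T, n = losses.shape
    weights = np.ones(n)
    learner_loss = 0.0
    for t in range(T):
        p = weights / weights.sum()          # play mixed strategy p_t
        learner_loss += p @ losses[t]        # expected loss this round
        weights *= np.exp(-eta * losses[t])  # multiplicative update
    best_fixed = losses.sum(axis=0).min()    # best single action in hindsight
    return learner_loss - best_fixed

# Illustration: 2 actions, 100 rounds of uniform random losses.
rng = np.random.default_rng(0)
losses = rng.uniform(size=(100, 2))
regret = hedge_regret(losses)
print(regret)  # stays well below T, consistent with the O(sqrt(T)) bound
```

For losses in $[0,1]$, Hedge satisfies the standard bound $\text{Regret}_T \le \ln(n)/\eta + \eta T/8$, which is $O(\sqrt{T \ln n})$ for a tuned $\eta$; this is the kind of no-regret behavior the paper probes LLM agents for.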

Cite

Text

Park et al. "Do LLM Agents Have Regret? A Case Study in Online Learning and Games." ICLR 2024 Workshops: AGI, 2024.

Markdown

[Park et al. "Do LLM Agents Have Regret? A Case Study in Online Learning and Games." ICLR 2024 Workshops: AGI, 2024.](https://mlanthology.org/iclrw/2024/park2024iclrw-llm/)

BibTeX

@inproceedings{park2024iclrw-llm,
  title     = {{Do LLM Agents Have Regret? A Case Study in Online Learning and Games}},
  author    = {Park, Chanwoo and Liu, Xiangyu and Ozdaglar, Asuman E. and Zhang, Kaiqing},
  booktitle = {ICLR 2024 Workshops: AGI},
  year      = {2024},
  url       = {https://mlanthology.org/iclrw/2024/park2024iclrw-llm/}
}