Learning a Game by Paying the Agents

Abstract

We study the problem of learning the utility functions of no-regret learning agents in a repeated normal-form game. Unlike most prior work, we introduce a principal who can observe the agents playing the game, send them signals, and give them *payments* as a function of their actions. We show that, for *any* no-regret learning algorithms used by the agents, the principal can learn the utility functions of all agents to any desired precision $\varepsilon > 0$ using a number of rounds polynomial in the size of the game. Our main technique is to formulate a zero-sum game between the principal and the agents, in which the principal chooses a strategy from the set of all payment functions to minimize the agents' payoff. Finally, we discuss implications for the problem of *steering* agents. Using our utility-learning algorithm as a subroutine, we introduce the first algorithm for steering arbitrary no-regret learning agents to a desired equilibrium without prior knowledge of their utility functions.
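To make the idea of learning utilities through payments concrete, the following toy sketch may help. It is not the paper's algorithm: it assumes a single agent with two actions who runs Hedge (multiplicative weights) on utility plus payment, and a principal who observes the agent's average mixed strategy rather than sampled actions. The names `hedge_play` and `estimate_gap`, the step size, and the round counts are all illustrative assumptions. The principal binary-searches for the payment on action 1 at which the agent stops favoring action 0, and that threshold approximates the hidden utility gap.

```python
import math

def hedge_play(utilities, payments, rounds=2000, eta=0.1):
    """Run Hedge on (utility + payment) and return the average mixed strategy.

    Assumption: the agent updates on expected payoffs, so the run is deterministic.
    """
    n = len(utilities)
    cum = [0.0] * n   # cumulative payoff of each action
    avg = [0.0] * n   # time-averaged probability of each action
    for _ in range(rounds):
        m = max(cum)  # subtract the max for numerical stability
        weights = [math.exp(eta * (c - m)) for c in cum]
        total = sum(weights)
        probs = [w / total for w in weights]
        for a in range(n):
            avg[a] += probs[a] / rounds
            cum[a] += utilities[a] + payments[a]
    return avg

def estimate_gap(utilities, lo=0.0, hi=1.0, steps=12):
    """Binary-search the payment on action 1 at which the agent favors it.

    Assumption: the true gap utilities[0] - utilities[1] lies in [lo, hi].
    The principal only sees the agent's play, never `utilities` directly.
    """
    for _ in range(steps):
        mid = (lo + hi) / 2
        avg = hedge_play(utilities, [0.0, mid])
        if avg[1] > 0.5:
            hi = mid  # payment is large enough to flip the agent
        else:
            lo = mid  # payment is still too small
    return (lo + hi) / 2

hidden = [0.7, 0.3]          # hypothetical utilities, unknown to the principal
gap = estimate_gap(hidden)   # approximates hidden[0] - hidden[1]
print(round(gap, 3))
```

This sketch only covers a single agent with a known learning rule. The setting in the abstract is harder: several agents, arbitrary no-regret algorithms, and a guarantee polynomial in the size of the game.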

Cite

Text

Zhang et al. "Learning a Game by Paying the Agents." International Conference on Learning Representations, 2026.

Markdown

[Zhang et al. "Learning a Game by Paying the Agents." International Conference on Learning Representations, 2026.](https://mlanthology.org/iclr/2026/zhang2026iclr-learning-e/)

BibTeX

@inproceedings{zhang2026iclr-learning-e,
  title     = {{Learning a Game by Paying the Agents}},
  author    = {Zhang, Brian Hu and Lin, Tao and Chen, Yiling and Sandholm, Tuomas},
  booktitle = {International Conference on Learning Representations},
  year      = {2026},
  url       = {https://mlanthology.org/iclr/2026/zhang2026iclr-learning-e/}
}