Regret Transfer and Parameter Optimization

AAAI 2014 pp. 594-601

doi:10.1609/AAAI.V28I1.8832 /aaai/2014/brown2014aaai-regret/

Abstract

Regret matching is a widely-used algorithm for learning how to act. We begin by proving that regrets on actions in one setting (game) can be transferred to warm start the regrets for solving a different setting with same structure but different payoffs that can be written as a function of parameters. We prove how this can be done by carefully discounting the prior regrets. This provides, to our knowledge, the first principled warm-starting method for no-regret learning. It also extends to warm-starting the widely-adopted counterfactual regret minimization (CFR) algorithm for large incomplete-information games; we show this experimentally as well. We then study optimizing a parameter vector for a player in a two-player zero-sum game (e.g., optimizing bet sizes to use in poker). We propose a custom gradient descent algorithm that provably finds a locally optimal parameter vector while leveraging our warm-start theory to significantly save regret-matching iterations at each step. It optimizes the parameter vector while simultaneously finding an equilibrium. We present experiments in no-limit Leduc Hold'em and no-limit Texas Hold'em to optimize bet sizing. This amounts to the first action abstraction algorithm (algorithm for selecting a small number of discrete actions to use from a continuum of actions---a key preprocessing step for solving large games using current equilibrium-finding algorithms) with convergence guarantees for extensive-form games.

PDF AAAI Semantic Scholar

Cite

Text

Brown and Sandholm. "Regret Transfer and Parameter Optimization." AAAI Conference on Artificial Intelligence, 2014. doi:10.1609/AAAI.V28I1.8832

Markdown

[Brown and Sandholm. "Regret Transfer and Parameter Optimization." AAAI Conference on Artificial Intelligence, 2014.](https://mlanthology.org/aaai/2014/brown2014aaai-regret/) doi:10.1609/AAAI.V28I1.8832

BibTeX

@inproceedings{brown2014aaai-regret,
  title     = {{Regret Transfer and Parameter Optimization}},
  author    = {Brown, Noam and Sandholm, Tuomas},
  booktitle = {AAAI Conference on Artificial Intelligence},
  year      = {2014},
  pages     = {594-601},
  doi       = {10.1609/AAAI.V28I1.8832},
  url       = {https://mlanthology.org/aaai/2014/brown2014aaai-regret/}
}