Playing the Matching-Shoulders Lob-Pass Game with Logarithmic Regret

Abstract

The best previous algorithm for the matching shoulders lob-pass game, Abe and Takeuchi's (1993) ARTHUR, suffered O(t^{1/2}) regret. We prove that this is the best possible performance for any algorithm that works by accurately estimating the opponent's payoff lines. Then we describe an algorithm which beats that bound and meets the information-theoretic lower bound of O(log t) regret by converging to the best lob rate without accurately estimating the payoff lines. The noise-tolerant binary search procedure that we develop is of independent interest.

Cite

Text

Kilian et al. "Playing the Matching-Shoulders Lob-Pass Game with Logarithmic Regret." Annual Conference on Computational Learning Theory, 1994. doi:10.1145/180139.181094

Markdown

[Kilian et al. "Playing the Matching-Shoulders Lob-Pass Game with Logarithmic Regret." Annual Conference on Computational Learning Theory, 1994.](https://mlanthology.org/colt/1994/kilian1994colt-playing/) doi:10.1145/180139.181094

BibTeX

@inproceedings{kilian1994colt-playing,
  title     = {{Playing the Matching-Shoulders Lob-Pass Game with Logarithmic Regret}},
  author    = {Kilian, Joe and Lang, Kevin J. and Pearlmutter, Barak A.},
  booktitle = {Annual Conference on Computational Learning Theory},
  year      = {1994},
  pages     = {159--164},
  doi       = {10.1145/180139.181094},
  url       = {https://mlanthology.org/colt/1994/kilian1994colt-playing/}
}