Optimization, Learning, and Games with Predictable Sequences

Abstract

We provide several applications of Optimistic Mirror Descent, an online learning algorithm based on the idea of predictable sequences. First, we recover the Mirror-Prox algorithm, prove an extension to Hölder-smooth functions, and apply the results to saddle-point problems. Second, we prove that a version of Optimistic Mirror Descent (which has a close relation to the Exponential Weights algorithm) can be used by two strongly-uncoupled players in a finite zero-sum matrix game to converge to the minimax equilibrium at the rate of O((log T)/T). This addresses a question of Daskalakis et al., 2011. Further, we consider a partial-information version of the problem. We then apply the results to approximate convex programming and exhibit a simple algorithm for the approximate Max-Flow problem.
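
The following is a minimal, illustrative sketch (not the authors' exact algorithm or step-size schedule) of the zero-sum game setting described in the abstract: Optimistic Mirror Descent with an entropy regularizer, where each player predicts the next loss vector by the previously observed one and takes a multiplicative-weights step against that prediction. Function and variable names (optimistic_hedge_game, eta, etc.) are chosen here for illustration only, and a fixed step size is assumed.

import numpy as np

def optimistic_hedge_game(A, T, eta):
    """Two players run optimistic exponential weights on a zero-sum game
    with payoff matrix A (row player minimizes x^T A y, column player maximizes)."""
    n, m = A.shape
    # Secondary iterates g (the states carried by the mirror-descent dynamics).
    g_x, g_y = np.ones(n) / n, np.ones(m) / m
    # Predictions M of the next loss/gain vectors (initialized to zero).
    M_x, M_y = np.zeros(n), np.zeros(m)
    avg_x, avg_y = np.zeros(n), np.zeros(m)
    for t in range(T):
        # Step 1: play against the predicted vectors (the optimistic step).
        x = g_x * np.exp(-eta * M_x); x /= x.sum()   # row player minimizes
        y = g_y * np.exp(+eta * M_y); y /= y.sum()   # column player maximizes
        # Observe the actual loss/gain vectors for this round.
        loss_x, gain_y = A @ y, A.T @ x
        # Step 2: update the secondary iterates with the observed vectors.
        g_x = g_x * np.exp(-eta * loss_x); g_x /= g_x.sum()
        g_y = g_y * np.exp(+eta * gain_y); g_y /= g_y.sum()
        # Use this round's vectors as the prediction (the "predictable sequence").
        M_x, M_y = loss_x, gain_y
        avg_x += x; avg_y += y
    # The averaged strategies approach a minimax equilibrium.
    return avg_x / T, avg_y / T

# Example usage: matching pennies; both averaged strategies approach (1/2, 1/2).
x_bar, y_bar = optimistic_hedge_game(np.array([[1., -1.], [-1., 1.]]), T=2000, eta=0.1)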

Cite

Text

Rakhlin and Sridharan. "Optimization, Learning, and Games with Predictable Sequences." Neural Information Processing Systems, 2013.

Markdown

[Rakhlin and Sridharan. "Optimization, Learning, and Games with Predictable Sequences." Neural Information Processing Systems, 2013.](https://mlanthology.org/neurips/2013/rakhlin2013neurips-optimization/)

BibTeX

@inproceedings{rakhlin2013neurips-optimization,
  title     = {{Optimization, Learning, and Games with Predictable Sequences}},
  author    = {Rakhlin, Sasha and Sridharan, Karthik},
  booktitle = {Neural Information Processing Systems},
  year      = {2013},
  pages     = {3066--3074},
  url       = {https://mlanthology.org/neurips/2013/rakhlin2013neurips-optimization/}
}