Adaptive Hedge

Abstract

Most methods for decision-theoretic online learning are based on the Hedge algorithm, which takes a parameter called the learning rate. In most previous analyses the learning rate was carefully tuned to obtain optimal worst-case performance, leading to suboptimal performance on easy instances, for example when there exists an action that is significantly better than all others. We propose a new way of setting the learning rate, which adapts to the difficulty of the learning problem: in the worst case our procedure still guarantees optimal performance, but on easy instances it achieves much smaller regret. In particular, our adaptive method achieves constant regret in a probabilistic setting, when there exists an action that on average obtains strictly smaller loss than all other actions. We also provide a simulation study comparing our approach to existing methods.
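For background, here is a minimal sketch of the Hedge algorithm the abstract refers to, with the standard worst-case learning-rate schedule η_t = √(8 ln K / t) for losses in [0, 1]; the paper's contribution is to replace this fixed schedule with one that adapts to the observed losses, and that adaptive rule is given in the full text, not here. The function name `hedge` and the simulated data are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def hedge(loss_matrix, eta_schedule):
    """Run Hedge (exponential weights) on a T x K matrix of losses in [0, 1].

    eta_schedule(t) gives the learning rate for round t (1-indexed).
    Returns the algorithm's cumulative expected loss.
    """
    T, K = loss_matrix.shape
    cum_losses = np.zeros(K)  # cumulative loss of each action so far
    total_loss = 0.0
    for t in range(1, T + 1):
        eta = eta_schedule(t)
        # Weights proportional to exp(-eta * cumulative loss); shift by the
        # minimum before exponentiating for numerical stability.
        w = np.exp(-eta * (cum_losses - cum_losses.min()))
        w /= w.sum()
        total_loss += w @ loss_matrix[t - 1]  # expected loss this round
        cum_losses += loss_matrix[t - 1]
    return total_loss

# Illustrative "easy" instance: one action has strictly smaller mean loss.
rng = np.random.default_rng(0)
losses = rng.uniform(size=(10_000, 4))
losses[:, 0] *= 0.5  # action 0 is better on average
K = losses.shape[1]
alg_loss = hedge(losses, lambda t: np.sqrt(8 * np.log(K) / t))
regret = alg_loss - losses.sum(axis=0).min()
print(f"regret of worst-case-tuned Hedge: {regret:.1f}")
```

On easy instances like this one, the worst-case schedule still pays regret of order √(T ln K); the abstract's claim is that the paper's adaptive schedule instead achieves constant regret when one action has strictly smaller mean loss.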

Cite

Text

van Erven et al. "Adaptive Hedge." Neural Information Processing Systems, 2011.

Markdown

[van Erven et al. "Adaptive Hedge." Neural Information Processing Systems, 2011.](https://mlanthology.org/neurips/2011/erven2011neurips-adaptive/)

BibTeX

@inproceedings{erven2011neurips-adaptive,
  title     = {{Adaptive Hedge}},
  author    = {van Erven, Tim and Koolen, Wouter M. and de Rooij, Steven and Grünwald, Peter},
  booktitle = {Neural Information Processing Systems},
  year      = {2011},
  pages     = {1656--1664},
  url       = {https://mlanthology.org/neurips/2011/erven2011neurips-adaptive/}
}