A Parameter-Free Hedging Algorithm
Abstract
We study the problem of decision-theoretic online learning (DTOL). Motivated by practical applications, we focus on DTOL when the number of actions is very large. Previous algorithms for learning in this framework have a tunable learning rate parameter, and a major barrier to using online-learning in practical applications is that it is not understood how to set this parameter optimally, particularly when the number of actions is large. In this paper, we offer a clean solution by proposing a novel and completely parameter-free algorithm for DTOL. In addition, we introduce a new notion of regret, which is more natural for applications with a large number of actions. We show that our algorithm achieves good performance with respect to this new notion of regret; in addition, it also achieves performance close to that of the best bounds achieved by previous algorithms with optimally-tuned parameters, according to previous notions of regret.
Cite
Text
Chaudhuri et al. "A Parameter-Free Hedging Algorithm." Neural Information Processing Systems, 2009.Markdown
[Chaudhuri et al. "A Parameter-Free Hedging Algorithm." Neural Information Processing Systems, 2009.](https://mlanthology.org/neurips/2009/chaudhuri2009neurips-parameterfree/)BibTeX
@inproceedings{chaudhuri2009neurips-parameterfree,
title = {{A Parameter-Free Hedging Algorithm}},
author = {Chaudhuri, Kamalika and Freund, Yoav and Hsu, Daniel J.},
booktitle = {Neural Information Processing Systems},
year = {2009},
pages = {297-305},
url = {https://mlanthology.org/neurips/2009/chaudhuri2009neurips-parameterfree/}
}