Online Learning with a Hint

Abstract

We study a variant of online linear optimization where the player receives a hint about the loss function at the beginning of each round. The hint is given in the form of a vector that is weakly correlated with the loss vector on that round. We show that the player can benefit from such a hint if the set of feasible actions is sufficiently round. Specifically, if the set is strongly convex, the hint can be used to guarantee a regret of $O(\log T)$, and if the set is $q$-uniformly convex for $q \in (2,3)$, the hint can be used to guarantee a regret of $o(\sqrt{T})$. In contrast, we establish $\Omega(\sqrt{T})$ lower bounds on regret when the set of feasible actions is a polyhedron.
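To make the protocol concrete, here is a minimal Python sketch of online linear optimization with hints over the Euclidean unit ball, which is one example of a strongly convex set. It rests on several assumptions that the abstract does not state: "weakly correlated" is modelled as a unit hint vector `v` with `dot(v, c) >= alpha * norm(c)` for the loss vector `c`, the losses are random Gaussian vectors, and the player uses a naive baseline that simply plays the negated hint. The names `make_hint`, `simulate`, and `alpha` are hypothetical, and the baseline is not the algorithm analysed in the paper; the sketch only illustrates the order of play and how regret is measured.

```python
import math
import random


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def norm(v):
    return math.sqrt(dot(v, v))


def make_hint(loss, alpha, rng):
    """Return a unit vector v with dot(v, loss) = alpha * norm(loss).

    Assumed formalization of a weakly correlated hint. Requires
    dimension >= 2 and a nonzero loss vector.
    """
    n = norm(loss)
    u = [x / n for x in loss]                      # unit loss direction
    g = [rng.gauss(0.0, 1.0) for _ in loss]        # random direction
    proj = dot(g, u)
    w = [gi - proj * ui for gi, ui in zip(g, u)]   # component orthogonal to u
    wn = norm(w)
    w = [x / wn for x in w]
    s = math.sqrt(1.0 - alpha ** 2)
    return [alpha * ui + s * wi for ui, wi in zip(u, w)]


def simulate(T=200, d=3, alpha=0.3, seed=0):
    """Run T rounds and return the regret of the naive baseline."""
    rng = random.Random(seed)
    total_loss = 0.0
    cumulative = [0.0] * d
    for _ in range(T):
        loss = [rng.gauss(0.0, 1.0) for _ in range(d)]  # hidden from the player
        hint = make_hint(loss, alpha, rng)              # revealed before acting
        action = [-h for h in hint]                     # naive: play the negated hint
        total_loss += dot(loss, action)                 # loss revealed after acting
        cumulative = [c + l for c, l in zip(cumulative, loss)]
    # Best fixed action in the unit ball is -cumulative / norm(cumulative).
    best_fixed_loss = -norm(cumulative)
    return total_loss - best_fixed_loss


if __name__ == "__main__":
    print(simulate())
```

Because the comparator is the best fixed point of the unit ball in hindsight, the returned regret can be negative for this random loss sequence; an adversarial sequence would be needed to probe the worst-case bounds quoted above.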

Cite

Text

Dekel et al. "Online Learning with a Hint." Neural Information Processing Systems, 2017.

Markdown

[Dekel et al. "Online Learning with a Hint." Neural Information Processing Systems, 2017.](https://mlanthology.org/neurips/2017/dekel2017neurips-online/)

BibTeX

@inproceedings{dekel2017neurips-online,
  title     = {{Online Learning with a Hint}},
  author    = {Dekel, Ofer and Flajolet, Arthur and Haghtalab, Nika and Jaillet, Patrick},
  booktitle = {Neural Information Processing Systems},
  year      = {2017},
  pages     = {5299--5308},
  url       = {https://mlanthology.org/neurips/2017/dekel2017neurips-online/}
}