Online Linear Optimization with Many Hints

Abstract

We study an online linear optimization (OLO) problem in which the learner is provided access to $K$ ``hint'' vectors in each round prior to making a decision. In this setting, we devise an algorithm that obtains logarithmic regret whenever there exists a convex combination of the $K$ hints that has positive correlation with the cost vectors. This significantly extends prior work that considered only the case $K=1$. To accomplish this, we develop a way to combine many arbitrary OLO algorithms to obtain regret that is only a logarithmic factor worse than the minimum regret of the original algorithms in hindsight; this result is of independent interest.
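To make the setting concrete, the following is a minimal sketch of the OLO-with-hints protocol described in the abstract: in each round the learner sees $K$ hint vectors, plays a point in the unit ball, and then suffers the linear cost $\langle c_t, x_t\rangle$; regret is measured against the best fixed unit-norm comparator in hindsight. The `follow_avg_hint` baseline below is a hypothetical illustration of exploiting correlated hints, not the paper's algorithm.

```python
import numpy as np

def olo_with_hints(costs, hints, learner):
    """Run the OLO-with-hints protocol.

    costs: array of shape (T, d), the cost vector c_t revealed after each round.
    hints: array of shape (T, K, d), the K hint vectors shown before each decision.
    learner: maps the (K, d) hint matrix to a decision x_t in the unit ball.
    Returns the regret against the best fixed comparator in the unit ball.
    """
    total = np.zeros(costs.shape[1])
    loss = 0.0
    for c_t, H_t in zip(costs, hints):
        x_t = learner(H_t)        # decision made after seeing the K hints
        loss += float(c_t @ x_t)  # linear cost suffered this round
        total += c_t
    # Best fixed unit-ball comparator plays -total/||total||, earning -||total||.
    comparator_loss = -np.linalg.norm(total)
    return loss - comparator_loss

def follow_avg_hint(H):
    """Naive baseline (NOT the paper's method): move against the mean hint,
    projected into the unit ball. Useful only when hints correlate with costs."""
    g = -H.mean(axis=0)
    return g / max(np.linalg.norm(g), 1.0)
```

When the hints are positively correlated with the cost vectors, even this naive combination achieves negative loss per round; the paper's contribution is an algorithm whose regret is logarithmic whenever *some* convex combination of the hints has such correlation, without knowing that combination in advance.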

Cite

Text

Bhaskara et al. "Online Linear Optimization with Many Hints." Neural Information Processing Systems, 2020.

Markdown

[Bhaskara et al. "Online Linear Optimization with Many Hints." Neural Information Processing Systems, 2020.](https://mlanthology.org/neurips/2020/bhaskara2020neurips-online-a/)

BibTeX

@inproceedings{bhaskara2020neurips-online-a,
  title     = {{Online Linear Optimization with Many Hints}},
  author    = {Bhaskara, Aditya and Cutkosky, Ashok and Kumar, Ravi and Purohit, Manish},
  booktitle = {Neural Information Processing Systems},
  year      = {2020},
  url       = {https://mlanthology.org/neurips/2020/bhaskara2020neurips-online-a/}
}