Online Learning Under Delayed Feedback

Abstract

Online learning with delayed feedback has recently received increasing attention due to its applications in distributed, web-based learning problems. In this paper we provide a systematic study of the topic and analyze the effect of delay on the regret of online learning algorithms. Somewhat surprisingly, it turns out that delay increases the regret multiplicatively in adversarial problems, and only additively in stochastic problems. We give meta-algorithms that transform, in a black-box fashion, algorithms developed for the non-delayed case into ones that can handle delays in the feedback loop. We also develop modifications of the well-known UCB algorithm for the bandit problem with delayed feedback; their advantage over the meta-algorithms is that they can be implemented with lower complexity.
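
To make the black-box idea concrete, below is a minimal illustrative sketch (not the paper's exact pseudocode) of a meta-algorithm for delayed feedback. It keeps a pool of independent copies of a non-delayed base learner: each round is handled by a copy that is not currently waiting for feedback, and when a delayed observation arrives it is routed back to the copy that made that prediction. With delays bounded by tau, at most tau + 1 copies are ever active, which is consistent with the multiplicative (in the delay) regret blow-up in the adversarial setting. The class and method names (BaseLearner, DelayedFeedbackMeta, predict, update) are assumptions for illustration, not identifiers from the paper.

from collections import deque


class BaseLearner:
    """Placeholder for any non-delayed online learning algorithm."""

    def predict(self):
        raise NotImplementedError

    def update(self, feedback):
        raise NotImplementedError


class DelayedFeedbackMeta:
    def __init__(self, make_base_learner):
        self.make_base_learner = make_base_learner
        self.free = deque()   # copies not currently waiting for feedback
        self.waiting = {}     # round index -> copy that predicted in that round

    def predict(self, t):
        # Reuse a free copy if one exists; otherwise spawn a new one.
        learner = self.free.popleft() if self.free else self.make_base_learner()
        self.waiting[t] = learner
        return learner.predict()

    def receive_feedback(self, t, feedback):
        # Feedback for round t (possibly arriving much later) goes back to the
        # copy that made the round-t prediction, which then becomes free again.
        learner = self.waiting.pop(t)
        learner.update(feedback)
        self.free.append(learner)

Each copy only ever sees a non-delayed subsequence of rounds, so its standard regret guarantee applies to that subsequence; summing over the (at most tau + 1) copies gives the multiplicative dependence on the delay. For the stochastic bandit case, the paper instead modifies UCB to use only the rewards received so far, which yields the additive delay penalty at lower implementation cost.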

Cite

Text

Joulani et al. "Online Learning Under Delayed Feedback." International Conference on Machine Learning, 2013.

Markdown

[Joulani et al. "Online Learning Under Delayed Feedback." International Conference on Machine Learning, 2013.](https://mlanthology.org/icml/2013/joulani2013icml-online/)

BibTeX

@inproceedings{joulani2013icml-online,
  title     = {{Online Learning Under Delayed Feedback}},
  author    = {Joulani, Pooria and Gyorgy, Andras and Szepesvari, Csaba},
  booktitle = {International Conference on Machine Learning},
  year      = {2013},
  pages     = {1453-1461},
  volume    = {28},
  url       = {https://mlanthology.org/icml/2013/joulani2013icml-online/}
}