Better Algorithms for Stochastic Bandits with Adversarial Corruptions

Abstract

We study the stochastic multi-armed bandits problem in the presence of adversarial corruption. We present a new algorithm for this problem whose regret is nearly optimal, substantially improving upon previous work. Our algorithm is agnostic to the level of adversarial contamination and can tolerate a significant amount of corruption with virtually no degradation in performance.

Cite

Text

Gupta et al. "Better Algorithms for Stochastic Bandits with Adversarial Corruptions." Conference on Learning Theory, 2019.

Markdown

[Gupta et al. "Better Algorithms for Stochastic Bandits with Adversarial Corruptions." Conference on Learning Theory, 2019.](https://mlanthology.org/colt/2019/gupta2019colt-better/)

BibTeX

@inproceedings{gupta2019colt-better,
  title     = {{Better Algorithms for Stochastic Bandits with Adversarial Corruptions}},
  author    = {Gupta, Anupam and Koren, Tomer and Talwar, Kunal},
  booktitle = {Conference on Learning Theory},
  year      = {2019},
  pages     = {1562-1578},
  volume    = {99},
  url       = {https://mlanthology.org/colt/2019/gupta2019colt-better/}
}