No-Regret Algorithms for Heavy-Tailed Linear Bandits

Abstract

We analyze the problem of linear bandits under heavy-tailed noise. Most of the work on linear bandits has been based on the assumption of bounded or sub-Gaussian noise. However, this assumption is often violated in common scenarios such as financial markets. We present two algorithms to tackle this problem: one based on dynamic truncation and one based on a median-of-means estimator. We show that, when the noise admits only a $1+\epsilon$ moment, these algorithms are still able to achieve regret in $\widetilde{O}(T^{\frac{2+\epsilon}{2(1+\epsilon)}})$ and $\widetilde{O}(T^{\frac{1+2\epsilon}{1+3\epsilon}})$, respectively. In particular, they guarantee sublinear regret as long as the noise has finite variance. We also present empirical results showing that our algorithms achieve better performance than the current state of the art for bounded noise when the $L_\infty$ bound on the noise is large yet the $1+\epsilon$ moment of the noise is small.
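The abstract names two robust estimation ideas, truncation and median of means, without spelling them out. The sketch below is not the paper's bandit algorithms (which plug these estimators into a linear-bandit exploration routine); it is a minimal illustration of the two underlying mean-estimation techniques on heavy-tailed scalar noise, with the function names, threshold, and block count chosen for illustration only.

```python
import numpy as np

def truncated_mean(samples, threshold):
    """Mean after zeroing out samples whose magnitude exceeds `threshold`.
    Truncation trades a small bias for much lighter tails."""
    samples = np.asarray(samples, dtype=float)
    clipped = np.where(np.abs(samples) <= threshold, samples, 0.0)
    return clipped.mean()

def median_of_means(samples, num_blocks):
    """Split the samples into `num_blocks` groups, average each group,
    and return the median of the group means. The median step makes the
    estimate robust to a few blocks being corrupted by extreme draws."""
    samples = np.asarray(samples, dtype=float)
    blocks = np.array_split(samples, num_blocks)
    return np.median([block.mean() for block in blocks])

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Illustrative heavy-tailed noise: Lomax/Pareto draws with shape 1.5
    # have a finite mean (here 2, subtracted off) but infinite variance.
    noise = rng.pareto(1.5, size=10_000) - 2.0
    print("empirical mean :", noise.mean())
    print("truncated mean :", truncated_mean(noise, threshold=20.0))
    print("median of means:", median_of_means(noise, num_blocks=50))
```

Both estimators concentrate around zero far more tightly than the empirical mean when outliers dominate, which is the property the paper's regret analysis builds on.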

Cite

Text

Medina and Yang. "No-Regret Algorithms for Heavy-Tailed Linear Bandits." International Conference on Machine Learning, 2016.

Markdown

[Medina and Yang. "No-Regret Algorithms for Heavy-Tailed Linear Bandits." International Conference on Machine Learning, 2016.](https://mlanthology.org/icml/2016/medina2016icml-noregret/)

BibTeX

@inproceedings{medina2016icml-noregret,
  title     = {{No-Regret Algorithms for Heavy-Tailed Linear Bandits}},
  author    = {Medina, Andres Munoz and Yang, Scott},
  booktitle = {International Conference on Machine Learning},
  year      = {2016},
  pages     = {1642-1650},
  volume    = {48},
  url       = {https://mlanthology.org/icml/2016/medina2016icml-noregret/}
}