Bandit Theory Meets Compressed Sensing for High Dimensional Stochastic Linear Bandit
Abstract
We consider a linear stochastic bandit problem where the dimension K of the unknown parameter heta is larger than the sampling budget n. Since usual linear bandit algorithms have a regret in O(K\sqrt{n}), it is in general impossible to obtain a sub-linear regret without further assumption. In this paper we make the assumption that heta is S-sparse, i.e. has at most S-non-zero components, and that the space of arms is the unit ball for the ||.||_2 norm. We combine ideas from Compressed Sensing and Bandit Theory to derive an algorithm with a regret bound in O(S\sqrt{n}). We detail an application to the problem of optimizing a function that depends on many variables but among which only a small number of them (initially unknown) are relevant.
Cite
Text
Carpentier and Munos. "Bandit Theory Meets Compressed Sensing for High Dimensional Stochastic Linear Bandit." Proceedings of the Fifteenth International Conference on Artificial Intelligence and Statistics, 2012.Markdown
[Carpentier and Munos. "Bandit Theory Meets Compressed Sensing for High Dimensional Stochastic Linear Bandit." Proceedings of the Fifteenth International Conference on Artificial Intelligence and Statistics, 2012.](https://mlanthology.org/aistats/2012/carpentier2012aistats-bandit/)BibTeX
@inproceedings{carpentier2012aistats-bandit,
title = {{Bandit Theory Meets Compressed Sensing for High Dimensional Stochastic Linear Bandit}},
author = {Carpentier, Alexandra and Munos, Remi},
booktitle = {Proceedings of the Fifteenth International Conference on Artificial Intelligence and Statistics},
year = {2012},
pages = {190-198},
volume = {22},
url = {https://mlanthology.org/aistats/2012/carpentier2012aistats-bandit/}
}