Efficient Kernelized UCB for Contextual Bandits

Abstract

In this paper, we tackle the computational efficiency of kernelized UCB algorithms in contextual bandits. While standard methods require $\mathcal{O}(CT^3)$ complexity, where $T$ is the horizon and the constant $C$ is related to optimizing the UCB rule, we propose an efficient contextual algorithm for large-scale problems. Specifically, our method relies on incremental Nyström approximations of the joint kernel embedding of contexts and actions. This allows us to achieve a complexity of $\mathcal{O}(CTm^2)$, where $m$ is the number of Nyström points. To recover the same regret as the standard kernelized UCB algorithm, $m$ needs to be of the order of the effective dimension of the problem, which is at most $\mathcal{O}(\sqrt{T})$ and nearly constant in some cases.
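
To make the idea of a Nyström approximation concrete, the following is a minimal sketch, not the authors' implementation. It assumes an RBF kernel, a fixed set of anchor points chosen in advance (the paper builds its approximation incrementally), and that each row of `X` is a context vector concatenated with an action vector. The names `rbf_kernel`, `nystrom_features`, and `ucb_scores`, and the parameters `gamma`, `lam`, and `beta`, are hypothetical and chosen only for illustration.

```python
import numpy as np

def rbf_kernel(A, B, gamma=1.0):
    # Pairwise RBF kernel k(a, b) = exp(-gamma * ||a - b||^2).
    sq = (A ** 2).sum(1)[:, None] + (B ** 2).sum(1)[None, :] - 2.0 * A @ B.T
    return np.exp(-gamma * np.maximum(sq, 0.0))

def nystrom_features(X, anchors, gamma=1.0, jitter=1e-10):
    # Map each row of X to an m-dimensional feature vector such that
    # Phi @ Phi.T approximates the full kernel matrix K(X, X).
    K_mm = rbf_kernel(anchors, anchors, gamma)
    K_nm = rbf_kernel(X, anchors, gamma)
    w, V = np.linalg.eigh(K_mm)          # K_mm = V diag(w) V^T
    w = np.maximum(w, jitter)            # guard against tiny eigenvalues
    K_mm_inv_sqrt = (V / np.sqrt(w)) @ V.T
    return K_nm @ K_mm_inv_sqrt          # shape (n, m)

def ucb_scores(Phi_hist, rewards, Phi_cand, lam=1.0, beta=1.0):
    # Ridge regression in the m-dimensional feature space, plus an
    # exploration bonus. This batch version rebuilds the m x m matrix
    # from the whole history on every call; an incremental variant
    # would instead update it with each new observation.
    m = Phi_hist.shape[1]
    A = Phi_hist.T @ Phi_hist + lam * np.eye(m)
    A_inv = np.linalg.inv(A)
    theta = A_inv @ (Phi_hist.T @ rewards)
    mean = Phi_cand @ theta
    var = np.einsum("ij,jk,ik->i", Phi_cand, A_inv, Phi_cand)
    return mean + beta * np.sqrt(np.maximum(var, 0.0))

# Usage: 50 observed (context, action) rows, 8 anchors, 5 candidates.
rng = np.random.default_rng(0)
X_hist = rng.normal(size=(50, 4))
y_hist = rng.normal(size=50)
X_cand = rng.normal(size=(5, 4))
anchors = X_hist[:8]

Phi_hist = nystrom_features(X_hist, anchors)
Phi_cand = nystrom_features(X_cand, anchors)
scores = ucb_scores(Phi_hist, y_hist, Phi_cand)
best = int(np.argmax(scores))            # index of the candidate to play
```

All linear algebra after the feature map involves only $m \times m$ matrices, which is where the dependence on $m^2$ rather than on the number of observed rounds comes from in this kind of approach.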

Cite

Text

Zenati et al. "Efficient Kernelized UCB for Contextual Bandits." Artificial Intelligence and Statistics, 2022.

Markdown

[Zenati et al. "Efficient Kernelized UCB for Contextual Bandits." Artificial Intelligence and Statistics, 2022.](https://mlanthology.org/aistats/2022/zenati2022aistats-efficient/)

BibTeX

@inproceedings{zenati2022aistats-efficient,
  title     = {{Efficient Kernelized UCB for Contextual Bandits}},
  author    = {Zenati, Houssam and Bietti, Alberto and Diemert, Eustache and Mairal, Julien and Martin, Matthieu and Gaillard, Pierre},
  booktitle = {Artificial Intelligence and Statistics},
  year      = {2022},
  pages     = {5689--5720},
  volume    = {151},
  url       = {https://mlanthology.org/aistats/2022/zenati2022aistats-efficient/}
}