An Analysis of Ensemble Sampling

Abstract

Ensemble sampling serves as a practical approximation to Thompson sampling when maintaining an exact posterior distribution over model parameters is computationally intractable. In this paper, we establish a regret bound that ensures desirable behavior when ensemble sampling is applied to the linear bandit problem. This represents the first rigorous regret analysis of ensemble sampling and is made possible by leveraging information-theoretic concepts and novel analytic techniques that may prove useful beyond the scope of this paper.
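To make the setting concrete, here is a minimal sketch of ensemble sampling on a linear bandit. It is not the authors' code. It assumes a Gaussian prior and Gaussian noise, and a common formulation in which each ensemble member is a regularized least-squares fit to independently perturbed rewards. The function name `run_ensemble_sampling` and all of its parameters are hypothetical and chosen for illustration only.

```python
import numpy as np

def run_ensemble_sampling(actions, theta_true, num_models=10, horizon=200,
                          noise_std=1.0, prior_std=1.0, seed=0):
    """Illustrative ensemble sampling for a linear bandit (assumed setup).

    actions:    (num_actions, dim) array of feature vectors.
    theta_true: (dim,) parameter used only to simulate rewards.
    Returns the cumulative regret after each step.
    """
    rng = np.random.default_rng(seed)
    dim = actions.shape[1]

    # All models see the same actions, so they share one precision matrix.
    precision = np.eye(dim) / prior_std**2
    # Each model starts from its own independent draw from the prior.
    prior_samples = rng.normal(0.0, prior_std, size=(num_models, dim))
    # Row m accumulates prior_precision @ prior_sample_m
    # plus sum_t a_t * (r_t + w_{t,m}) / noise_std**2.
    b = prior_samples / prior_std**2

    best_mean = np.max(actions @ theta_true)
    regret = np.zeros(horizon)

    for t in range(horizon):
        # Pick one model uniformly at random and act greedily with respect to it.
        m = rng.integers(num_models)
        theta_m = np.linalg.solve(precision, b[m])
        a = actions[int(np.argmax(actions @ theta_m))]

        # Observe a noisy reward from the simulated environment.
        reward = a @ theta_true + rng.normal(0.0, noise_std)

        # Update every model with its own independently perturbed copy of the reward.
        perturbations = rng.normal(0.0, noise_std, size=num_models)
        precision += np.outer(a, a) / noise_std**2
        b += np.outer(reward + perturbations, a) / noise_std**2

        regret[t] = best_mean - a @ theta_true

    return np.cumsum(regret)
```

With a single model the agent simply acts greedily on one perturbed estimate. Under these assumptions, a larger `num_models` makes the ensemble a closer stand-in for sampling from the exact posterior, which is the regime a regret bound for ensemble sampling is concerned with.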

Cite

Text

Qin et al. "An Analysis of Ensemble Sampling." Neural Information Processing Systems, 2022.

Markdown

[Qin et al. "An Analysis of Ensemble Sampling." Neural Information Processing Systems, 2022.](https://mlanthology.org/neurips/2022/qin2022neurips-analysis/)

BibTeX

@inproceedings{qin2022neurips-analysis,
  title     = {{An Analysis of Ensemble Sampling}},
  author    = {Qin, Chao and Wen, Zheng and Lu, Xiuyuan and Van Roy, Benjamin},
  booktitle = {Neural Information Processing Systems},
  year      = {2022},
  url       = {https://mlanthology.org/neurips/2022/qin2022neurips-analysis/}
}