A KL-LUCB Algorithm for Large-Scale Crowdsourcing

Abstract

This paper focuses on best-arm identification in multi-armed bandits with bounded rewards. We develop an algorithm that is a fusion of lil-UCB and KL-LUCB, offering the best qualities of the two algorithms in one method. This is achieved by proving a novel anytime confidence bound for the mean of bounded distributions, which is the analogue of the LIL-type bounds recently developed for sub-Gaussian distributions. We corroborate our theoretical results with numerical experiments based on the New Yorker Cartoon Caption Contest.

Cite

Text

Tanczos et al. "A KL-LUCB Algorithm for Large-Scale Crowdsourcing." Neural Information Processing Systems, 2017.

Markdown

[Tanczos et al. "A KL-LUCB Algorithm for Large-Scale Crowdsourcing." Neural Information Processing Systems, 2017.](https://mlanthology.org/neurips/2017/tanczos2017neurips-kllucb/)

BibTeX

@inproceedings{tanczos2017neurips-kllucb,
  title     = {{A KL-LUCB Algorithm for Large-Scale Crowdsourcing}},
  author    = {Tanczos, Ervin and Nowak, Robert and Mankoff, Bob},
  booktitle = {Neural Information Processing Systems},
  year      = {2017},
  pages     = {5894-5903},
  url       = {https://mlanthology.org/neurips/2017/tanczos2017neurips-kllucb/}
}