A KL-LUCB Algorithm for Large-Scale Crowdsourcing
Abstract
This paper focuses on best-arm identification in multi-armed bandits with bounded rewards. We develop an algorithm that is a fusion of lil-UCB and KL-LUCB, offering the best qualities of the two algorithms in one method. This is achieved by proving a novel anytime confidence bound for the mean of bounded distributions, which is the analogue of the LIL-type bounds recently developed for sub-Gaussian distributions. We corroborate our theoretical results with numerical experiments based on the New Yorker Cartoon Caption Contest.
Cite
Text
Tanczos et al. "A KL-LUCB Algorithm for Large-Scale Crowdsourcing." Neural Information Processing Systems, 2017.Markdown
[Tanczos et al. "A KL-LUCB Algorithm for Large-Scale Crowdsourcing." Neural Information Processing Systems, 2017.](https://mlanthology.org/neurips/2017/tanczos2017neurips-kllucb/)BibTeX
@inproceedings{tanczos2017neurips-kllucb,
title = {{A KL-LUCB Algorithm for Large-Scale Crowdsourcing}},
author = {Tanczos, Ervin and Nowak, Robert and Mankoff, Bob},
booktitle = {Neural Information Processing Systems},
year = {2017},
pages = {5894-5903},
url = {https://mlanthology.org/neurips/2017/tanczos2017neurips-kllucb/}
}