(Nearly) Optimal Differentially Private Stochastic Multi-Arm Bandits
Abstract
We study the problem of private stochastic multi-arm bandits. Our notion of privacy is the same as in earlier works on private online learning [13, 17, 24]. We design algorithms that are i) differentially private, and ii) have regret guarantees that (almost) match those of the best non-private algorithms (e.g., upper confidence bound sampling and Thompson sampling). Moreover, through experiments on both simulated and real datasets, we empirically demonstrate the effectiveness of our algorithms.
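To illustrate the general idea of a differentially private bandit algorithm, here is a minimal sketch of a UCB variant that perturbs each arm's reward sum with Laplace noise before computing its confidence index. This is an illustrative simplification, not the authors' exact construction (their algorithms use more careful private counters, e.g. tree-based aggregation, to get near-optimal regret); the function name, parameters, and Bernoulli reward model are assumptions made for the example.

```python
import math
import random


def private_ucb_sketch(arm_probs, T, epsilon, seed=0):
    """Illustrative epsilon-DP-flavored UCB sketch (NOT the paper's exact
    algorithm): fresh Laplace(1/epsilon) noise perturbs each arm's reward
    sum before the UCB index is computed. Rewards are Bernoulli(p_i)."""
    rng = random.Random(seed)
    K = len(arm_probs)
    counts = [0] * K          # number of pulls per arm
    sums = [0.0] * K          # cumulative reward per arm
    total_reward = 0.0

    def laplace(scale):
        # Difference of two Exp(1/scale) variables is Laplace(scale).
        return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)

    for t in range(1, T + 1):
        if t <= K:
            i = t - 1  # pull each arm once to initialize
        else:
            def noisy_index(j):
                noisy_mean = (sums[j] + laplace(1.0 / epsilon)) / counts[j]
                return noisy_mean + math.sqrt(2.0 * math.log(t) / counts[j])
            i = max(range(K), key=noisy_index)
        r = 1.0 if rng.random() < arm_probs[i] else 0.0
        counts[i] += 1
        sums[i] += r
        total_reward += r
    return total_reward, counts


# Usage: three Bernoulli arms; with enough rounds the best arm (p = 0.9)
# should receive the large majority of pulls despite the added noise.
reward, pulls = private_ucb_sketch([0.3, 0.5, 0.9], T=5000, epsilon=1.0)
```

Note that adding fresh noise to the full running sum at every round, as above, does not by itself yield a tight privacy accounting over T rounds; the point of the paper's tree-based machinery is precisely to keep the total noise (and hence the regret penalty) polylogarithmic in T.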
Cite
Text
Mishra and Thakurta. "(Nearly) Optimal Differentially Private Stochastic Multi-Arm Bandits." Conference on Uncertainty in Artificial Intelligence, 2015.
Markdown
[Mishra and Thakurta. "(Nearly) Optimal Differentially Private Stochastic Multi-Arm Bandits." Conference on Uncertainty in Artificial Intelligence, 2015.](https://mlanthology.org/uai/2015/mishra2015uai-nearly/)
BibTeX
@inproceedings{mishra2015uai-nearly,
title = {{(Nearly) Optimal Differentially Private Stochastic Multi-Arm Bandits}},
author = {Mishra, Nikita and Thakurta, Abhradeep},
booktitle = {Conference on Uncertainty in Artificial Intelligence},
year = {2015},
pages = {592-601},
url = {https://mlanthology.org/uai/2015/mishra2015uai-nearly/}
}