Distributed Differential Privacy in Multi-Armed Bandits

Abstract

We consider the standard $K$-armed bandit problem under a distributed trust model of differential privacy (DP), which enables to guarantee privacy without a trustworthy server. Under this trust model, previous work largely focus on achieving privacy using a shuffle protocol, where a batch of users data are randomly permuted before sending to a central server. This protocol achieves ($\epsilon,\delta$) or approximate-DP guarantee by sacrificing an additive $O\!\left(\!\frac{K\log T\sqrt{\log(1/\delta)}}{\epsilon}\!\right)\!$ factor in $T$-step cumulative regret. In contrast, the optimal privacy cost to achieve a stronger ($\epsilon,0$) or pure-DP guarantee under the widely used central trust model is only $\Theta\!\left(\!\frac{K\log T}{\epsilon}\!\right)\!$, where, however, a trusted server is required. In this work, we aim to obtain a pure-DP guarantee under distributed trust model while sacrificing no more regret than that under central trust model. We achieve this by designing a generic bandit algorithm based on successive arm elimination, where privacy is guaranteed by corrupting rewards with an equivalent discrete Laplace noise ensured by a secure computation protocol. We numerically simulate regret performance of our algorithm, which corroborate our theoretical findings.

Cite

Text

Chowdhury and Zhou. "Distributed Differential Privacy in Multi-Armed Bandits." NeurIPS 2022 Workshops: TSRML, 2022.

Markdown

[Chowdhury and Zhou. "Distributed Differential Privacy in Multi-Armed Bandits." NeurIPS 2022 Workshops: TSRML, 2022.](https://mlanthology.org/neuripsw/2022/chowdhury2022neuripsw-distributed/)

BibTeX

@inproceedings{chowdhury2022neuripsw-distributed,
  title     = {{Distributed Differential Privacy in Multi-Armed Bandits}},
  author    = {Chowdhury, Sayak Ray and Zhou, Xingyu},
  booktitle = {NeurIPS 2022 Workshops: TSRML},
  year      = {2022},
  url       = {https://mlanthology.org/neuripsw/2022/chowdhury2022neuripsw-distributed/}
}