The Price of Differential Privacy for Online Learning

Abstract

We design differentially private algorithms for the problem of online linear optimization in the full-information and bandit settings with optimal $O(\sqrt{T})$ regret bounds. In the full-information setting, our results demonstrate that $\epsilon$-differential privacy may be ensured for free: in particular, the regret bounds scale as $O(\sqrt{T} + 1/\epsilon)$. For bandit linear optimization, and as a special case, for non-stochastic multi-armed bandits, the proposed algorithm achieves a regret of $O(\sqrt{T}/\epsilon)$, while the previously best known regret bound was $O(T^{2/3}/\epsilon)$.
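
To make the full-information setting concrete, below is a minimal, illustrative sketch of a noisy follow-the-regularized-leader update for online linear optimization over the probability simplex (the experts setting). It is not the algorithm from the paper: the per-round Laplace perturbation and its log(T)/epsilon scale are hypothetical stand-ins, and obtaining a formal epsilon-differential-privacy guarantee with the stated regret requires the noise calibration and analysis developed in the paper.

import numpy as np

def private_ftrl_simplex(losses, epsilon, eta, rng=None):
    """Illustrative noisy follow-the-regularized-leader over the simplex.

    losses : (T, d) array of per-round loss vectors with entries in [0, 1].
    epsilon: privacy parameter used only to scale the illustrative noise.
    eta    : learning rate for the entropic (exponential-weights) update.
    """
    rng = np.random.default_rng() if rng is None else rng
    T, d = losses.shape
    cumulative = np.zeros(d)   # running sum of observed loss vectors
    total_loss = 0.0

    for t in range(T):
        # Perturb the cumulative losses before choosing the play; fresh
        # Laplace noise each round is a stand-in for a proper DP mechanism.
        noisy = cumulative + rng.laplace(scale=np.log(T + 1) / epsilon, size=d)

        # Entropic FTRL step: exponential weights on the noisy cumulative loss
        # (shifted by the minimum for numerical stability).
        weights = np.exp(-eta * (noisy - noisy.min()))
        play = weights / weights.sum()

        total_loss += play @ losses[t]   # full-information feedback
        cumulative += losses[t]

    return total_loss

# Example usage: 1000 rounds, 5 experts, uniformly random losses in [0, 1].
rng = np.random.default_rng(0)
losses = rng.random((1000, 5))
print(private_ftrl_simplex(losses, epsilon=1.0, eta=0.05, rng=rng))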

Cite

Text

Agarwal and Singh. "The Price of Differential Privacy for Online Learning." International Conference on Machine Learning, 2017.

Markdown

[Agarwal and Singh. "The Price of Differential Privacy for Online Learning." International Conference on Machine Learning, 2017.](https://mlanthology.org/icml/2017/agarwal2017icml-price/)

BibTeX

@inproceedings{agarwal2017icml-price,
  title     = {{The Price of Differential Privacy for Online Learning}},
  author    = {Agarwal, Naman and Singh, Karan},
  booktitle = {International Conference on Machine Learning},
  year      = {2017},
  pages     = {32--40},
  volume    = {70},
  url       = {https://mlanthology.org/icml/2017/agarwal2017icml-price/}
}