The Price of Differential Privacy for Online Learning
Abstract
We design differentially private algorithms for the problem of online linear optimization, in both the full-information and bandit settings, with optimal $O(\sqrt{T})$ regret bounds. In the full-information setting, our results demonstrate that $\epsilon$-differential privacy may be ensured for free: in particular, the regret bounds scale as $O(\sqrt{T} + 1/\epsilon)$. For bandit linear optimization, and as a special case, for non-stochastic multi-armed bandits, the proposed algorithm achieves a regret of $O(\sqrt{T}/\epsilon)$, while the previously best known regret bound was $O(T^{2/3}/\epsilon)$.
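The abstract does not spell out the mechanism, but the flavor of the full-information result can be illustrated with a toy example. The sketch below is not the paper's algorithm; it is a minimal, assumption-laden illustration of private online learning in the experts setting, where Laplace noise of scale roughly $1/\epsilon$ is added to the cumulative loss vector before an exponential-weights (entropic FTRL) update. The function name `private_ftrl_experts`, the per-round noise scale, and the learning-rate choice are placeholders for illustration only; the paper's actual algorithm and its noise calibration (e.g., via tree-based aggregation) differ.

```python
import numpy as np

def private_ftrl_experts(loss_matrix, epsilon, eta=None, rng=None):
    """Hedged sketch: exponential-weights (entropic FTRL) over the probability
    simplex, with Laplace noise added to the running loss totals so that the
    released predictions depend on the data only through a noisy statistic.

    loss_matrix : (T, d) array of per-round losses in [0, 1].
    epsilon     : privacy parameter (the per-round Laplace scale below is an
                  illustrative placeholder, not a calibrated budget split).
    """
    rng = np.random.default_rng() if rng is None else rng
    T, d = loss_matrix.shape
    eta = np.sqrt(np.log(d) / T) if eta is None else eta

    cumulative = np.zeros(d)   # true running sum of loss vectors
    total_loss = 0.0

    for t in range(T):
        # Perturb only the sufficient statistic (cumulative losses); a real
        # analysis would prescribe the noise scale, e.g. via tree aggregation.
        noisy = cumulative + rng.laplace(scale=1.0 / epsilon, size=d)

        # Exponential-weights play on the noisy cumulative losses.
        weights = np.exp(-eta * (noisy - noisy.min()))
        p = weights / weights.sum()

        total_loss += p @ loss_matrix[t]
        cumulative += loss_matrix[t]

    best_fixed = loss_matrix.sum(axis=0).min()
    return total_loss - best_fixed   # regret against the best fixed expert
```

The point of the sketch is the intuition behind the full-information claim: the noise perturbs only a running sufficient statistic, so under a suitable analysis its contribution to the regret enters additively rather than multiplying the $\sqrt{T}$ term, which is roughly consistent with the $O(\sqrt{T} + 1/\epsilon)$ scaling stated above.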
Cite

Agarwal and Singh. "The Price of Differential Privacy for Online Learning." International Conference on Machine Learning, 2017.
BibTeX
@inproceedings{agarwal2017icml-price,
  title     = {{The Price of Differential Privacy for Online Learning}},
  author    = {Agarwal, Naman and Singh, Karan},
  booktitle = {International Conference on Machine Learning},
  year      = {2017},
  pages     = {32--40},
  volume    = {70},
  url       = {https://mlanthology.org/icml/2017/agarwal2017icml-price/}
}