Open Problem: What Is the Complexity of Joint Differential Privacy in Linear Contextual Bandits?
Abstract
Contextual bandits serve as a theoretical framework for designing recommender systems, which often rely on user-sensitive data, making privacy a critical concern. However, a significant gap remains between the known upper and lower bounds on the regret achievable in linear contextual bandits under Joint Differential Privacy (JDP), the privacy definition most commonly used in this setting. In particular, the best known regret upper bound is $O\left(d \sqrt{T} \log(T) + d^{3/4} \sqrt{T \log(1/\delta)} / \sqrt{\epsilon} \right)$, while the best known lower bound is $\Omega \left(\sqrt{d T \log(K)} + d/(\epsilon + \delta)\right)$. We discuss recent progress on this problem, both in algorithm design and in lower-bound techniques, and pose the open questions.
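To make the size of this gap concrete, here is a minimal numerical sketch in Python that evaluates both bounds with all constants dropped. The parameter values for d, T, K, eps, and delta are illustrative assumptions, not values taken from the paper:

import math

# Illustrative parameters (assumptions chosen for the example, not from the paper)
d, T, K = 20, 10**6, 100
eps, delta = 0.5, 1e-6

# Best known regret upper bound under JDP (constants dropped):
#   d * sqrt(T) * log(T) + d^(3/4) * sqrt(T * log(1/delta)) / sqrt(eps)
upper = d * math.sqrt(T) * math.log(T) \
        + d**0.75 * math.sqrt(T * math.log(1 / delta)) / math.sqrt(eps)

# Best known regret lower bound (constants dropped):
#   sqrt(d * T * log(K)) + d / (eps + delta)
lower = math.sqrt(d * T * math.log(K)) + d / (eps + delta)

print(f"upper bound ~ {upper:,.0f}")
print(f"lower bound ~ {lower:,.0f}")
print(f"ratio       ~ {upper / lower:,.1f}x")

Note the qualitative mismatch visible in the formulas themselves: the privacy-dependent term $d^{3/4} \sqrt{T \log(1/\delta)} / \sqrt{\epsilon}$ in the upper bound grows with $T$, whereas the privacy term $d/(\epsilon + \delta)$ in the lower bound does not, so the gap widens as $T$ grows.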
Cite
Text
Azize and Basu. "Open Problem: What Is the Complexity of Joint Differential Privacy in Linear Contextual Bandits?" Conference on Learning Theory, 2024.
Markdown
[Azize and Basu. "Open Problem: What Is the Complexity of Joint Differential Privacy in Linear Contextual Bandits?" Conference on Learning Theory, 2024.](https://mlanthology.org/colt/2024/azize2024colt-open/)
BibTeX
@inproceedings{azize2024colt-open,
title = {{Open Problem: What Is the Complexity of Joint Differential Privacy in Linear Contextual Bandits?}},
author = {Azize, Achraf and Basu, Debabrota},
booktitle = {Conference on Learning Theory},
year = {2024},
pages = {5306--5311},
volume = {247},
url = {https://mlanthology.org/colt/2024/azize2024colt-open/}
}