Online Distribution Learning with Local Privacy Constraints

Abstract

We study the problem of online conditional distribution estimation with \emph{unbounded} label sets under local differential privacy. The problem may be succinctly stated as follows. Let $\mathcal{F}$ be a distribution-valued function class with an unbounded label set. Our aim is to estimate an \emph{unknown} function $f\in \mathcal{F}$ in an online fashion. More precisely, at time $t$, given a sample ${\mathbf{x}}_t$, we generate an estimate of $f({\mathbf{x}}_t)$ using only a \emph{privatized} version of the true \emph{labels} sampled from $f({\mathbf{x}}_t)$. The objective is to minimize the cumulative KL-risk over a finite horizon $T$. We show that under $(\epsilon,0)$-local differential privacy for the labels, the KL-risk equals $\tilde{\Theta}(\frac{1}{\epsilon}\sqrt{KT})$, up to poly-logarithmic factors, where $K=|\mathcal{F}|$. This result significantly differs from the $\tilde{\Theta}(\sqrt{T\log K})$ bound derived in Wu et al. (2023a) for \emph{bounded} label sets. As a side result, our approach recovers a nearly tight upper bound for the hypothesis selection problem of Gopi et al. (2020), which had previously been established only in the \emph{batch} setting.
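The abstract does not fix a particular privatization mechanism. As an illustrative sketch only (over a finite label set of size $k$, unlike the paper's unbounded setting), a standard channel satisfying $(\epsilon,0)$-local differential privacy is $k$-ary randomized response; the function names below are our own, not from the paper:

```python
import math
import random

def randomized_response(label: int, k: int, epsilon: float) -> int:
    """k-ary randomized response: report the true label with probability
    e^eps / (e^eps + k - 1), otherwise a uniformly random *other* label.
    This channel satisfies (epsilon, 0)-local differential privacy."""
    p_true = math.exp(epsilon) / (math.exp(epsilon) + k - 1)
    if random.random() < p_true:
        return label
    # Pick uniformly among the k - 1 labels different from `label`.
    other = random.randrange(k - 1)
    return other if other < label else other + 1

def ldp_ratio(k: int, epsilon: float) -> float:
    """Worst-case likelihood ratio P(out = v | y) / P(out = v | y')
    over all outputs v and label pairs y, y'; equals e^epsilon here."""
    p_true = math.exp(epsilon) / (math.exp(epsilon) + k - 1)
    p_other = (1 - p_true) / (k - 1)
    return p_true / p_other
```

The online learner then sees only the channel's output at each round, and `ldp_ratio` confirms the privacy guarantee: the ratio of output probabilities under any two true labels is exactly $e^{\epsilon}$.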

Cite

Text

Sima et al. "Online Distribution Learning with Local Privacy Constraints." Artificial Intelligence and Statistics, 2024.

Markdown

[Sima et al. "Online Distribution Learning with Local Privacy Constraints." Artificial Intelligence and Statistics, 2024.](https://mlanthology.org/aistats/2024/sima2024aistats-online/)

BibTeX

@inproceedings{sima2024aistats-online,
  title     = {{Online Distribution Learning with Local Privacy Constraints}},
  author    = {Sima, Jin and Wu, Changlong and Milenkovic, Olgica and Szpankowski, Wojciech},
  booktitle = {Artificial Intelligence and Statistics},
  year      = {2024},
  pages     = {460--468},
  volume    = {238},
  url       = {https://mlanthology.org/aistats/2024/sima2024aistats-online/}
}