Bounded Space Differentially Private Quantiles

Abstract

Estimating the quantiles of a large dataset is a fundamental problem in both the streaming algorithms literature and the differential privacy literature. However, all existing private mechanisms for distribution-independent quantile computation require space at least linear in the input size $n$. In this work, we devise a differentially private algorithm for the quantile estimation problem, with strongly sublinear space complexity, in the one-shot and continual observation settings. Our basic mechanism estimates any $\alpha$-approximate quantile of a length-$n$ stream over a data universe $\mathcal{X}$ with probability $1-\beta$ using $O\left( \frac{\log (|\mathcal{X}|/\beta) \log (\alpha \epsilon n)}{\alpha \epsilon} \right)$ space while satisfying $\epsilon$-differential privacy at a single time point. Our approach builds upon deterministic streaming algorithms for non-private quantile estimation instantiating the exponential mechanism using a utility function defined on sketch items, while (privately) sampling from intervals defined by the sketch. We also present another algorithm based on histograms that is especially well-suited to the multiple quantiles case. We implement our algorithms and experimentally evaluate them on synthetic and real-world datasets.

Cite

Text

Alabi et al. "Bounded Space Differentially Private Quantiles." Transactions on Machine Learning Research, 2023.

Markdown

[Alabi et al. "Bounded Space Differentially Private Quantiles." Transactions on Machine Learning Research, 2023.](https://mlanthology.org/tmlr/2023/alabi2023tmlr-bounded/)

BibTeX

@article{alabi2023tmlr-bounded,
  title     = {{Bounded Space Differentially Private Quantiles}},
  author    = {Alabi, Daniel and Ben-Eliezer, Omri and Chaturvedi, Anamay},
  journal   = {Transactions on Machine Learning Research},
  year      = {2023},
  url       = {https://mlanthology.org/tmlr/2023/alabi2023tmlr-bounded/}
}