Differentially Private Space-Efficient Algorithms for Counting Distinct Elements in the Turnstile Model

Abstract

The turnstile continual release model of differential privacy captures scenarios where a privacy-preserving real-time analysis is sought for a dataset evolving through additions and deletions. In typical applications of real-time data analysis, both the length of the stream $T$ and the size of the universe $|\mathcal{U}|$ from which data come can be extremely large. This motivates the study of private algorithms in the turnstile setting using space sublinear in both $T$ and $|\mathcal{U}|$. In this paper, we give the first sublinear space differentially private algorithms for the fundamental problems of counting distinct elements in the turnstile streaming model. Our algorithm achieves, on arbitrary streams, $O_{\eta}(T^{1/3})$ space and additive error, and a $(1+\eta)$-relative approximation for all $\eta \in (0,1)$. Our result significantly improves upon the space requirements of the state-of-the-art algorithms for this problem, which is linear, approaching the known $\Omega(T^{1/4})$ additive error lower bound for arbitrary streams. Moreover, when a bound $W$ on the number of times an item appears in the stream is known, our algorithm provides $O_{\eta}(\sqrt{W})$ additive error, using $O_{\eta}(\sqrt{W})$ space. This additive error asymptotically matches that of prior work which required instead linear space. Our results address an open question posed by Jain et al. about designing low-memory mechanisms for this problem. We complement this results with a space lower bound for this problem, which shows that any algorithm that uses similar techniques must use space $\Omega(T^{1/3})$.

Cite

Text

Cummings et al. "Differentially Private Space-Efficient Algorithms for Counting Distinct Elements in the Turnstile Model." Proceedings of the 42nd International Conference on Machine Learning, 2025.

Markdown

[Cummings et al. "Differentially Private Space-Efficient Algorithms for Counting Distinct Elements in the Turnstile Model." Proceedings of the 42nd International Conference on Machine Learning, 2025.](https://mlanthology.org/icml/2025/cummings2025icml-differentially/)

BibTeX

@inproceedings{cummings2025icml-differentially,
  title     = {{Differentially Private Space-Efficient Algorithms for Counting Distinct Elements in the Turnstile Model}},
  author    = {Cummings, Rachel and Epasto, Alessandro and Mao, Jieming and Mukherjee, Tamalika and Ou, Tingting and Zhong, Peilin},
  booktitle = {Proceedings of the 42nd International Conference on Machine Learning},
  year      = {2025},
  pages     = {11662-11691},
  volume    = {267},
  url       = {https://mlanthology.org/icml/2025/cummings2025icml-differentially/}
}