A New Framework for Measuring Re-Identification Risk
Abstract
Compact user representations (such as embeddings) form the backbone of personalization services. In this work, we present a new theoretical framework to measure re-identification risk in such user representations. Our framework, based on hypothesis testing, formally bounds the probability that an attacker may be able to obtain the identity of a user from their representation. As an application, we show that our framework is general enough to model important real-world applications such as Chrome's Topics API for interest-based advertising. We complement our theoretical bounds by showing provably good attack algorithms for re-identification that we use to estimate the re-identification risk in the Topics API. We believe this work provides a rigorous and interpretable notion of re-identification risk and a framework to measure it that can be used to inform real-world applications.
Cite
Text
Carey et al. "A New Framework for Measuring Re-Identification Risk." NeurIPS 2023 Workshops: RegML, 2023.
Markdown
[Carey et al. "A New Framework for Measuring Re-Identification Risk." NeurIPS 2023 Workshops: RegML, 2023.](https://mlanthology.org/neuripsw/2023/carey2023neuripsw-new/)
BibTeX
@inproceedings{carey2023neuripsw-new,
title = {{A New Framework for Measuring Re-Identification Risk}},
author = {Carey, Cj and Dick, Travis and Epasto, Alessandro and Javanmard, Adel and Karlin, Josh and Kumar, Shankar and Medina, Andres Munoz and Mirrokni, Vahab and Nunes, Gabriel and Vassilvitskii, Sergei and Zhong, Peilin},
booktitle = {NeurIPS 2023 Workshops: RegML},
year = {2023},
url = {https://mlanthology.org/neuripsw/2023/carey2023neuripsw-new/}
}