Simulation-Based Bayesian Inference from Privacy Protected Data
Abstract
Many modern statistical analysis and machine learning applications require training models on sensitive user data. Under a formal definition of privacy protection, differentially private algorithms inject calibrated noise into the confidential data or into the data-analysis process to produce privacy-protected datasets or query results. However, restricting analysts to privatized data makes valid statistical inference computationally challenging. In this work, we propose simulation-based methods for Bayesian inference from privacy-protected datasets. In addition to sequential Monte Carlo approximate Bayesian computation, we adopt neural conditional density estimators as a flexible family of distributions with which to approximate the posterior distribution of model parameters given the observed private query results. We illustrate our methods on discrete time-series data under an infectious disease model and on ordinary linear regression models. Illustrating the privacy-utility trade-off, our experiments and analysis demonstrate the necessity and feasibility of designing valid statistical inference procedures that correct for the biases introduced by privacy-protection mechanisms.
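As a rough illustration of the core idea, the sketch below runs plain rejection ABC, a simpler stand-in for the sequential Monte Carlo ABC and neural density estimators described in the abstract, on a Bernoulli model whose sum query is released through the Laplace mechanism. The model, prior, privacy budget `eps`, and tolerance `tol` are hypothetical choices for illustration, not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# --- Hypothetical setup: Bernoulli data with a privatized sum query ---
n, theta_true, eps = 200, 0.3, 1.0                 # sample size, true parameter, privacy budget
x = rng.binomial(1, theta_true, size=n)
s_priv = x.sum() + rng.laplace(scale=1.0 / eps)    # Laplace mechanism (sensitivity of a sum of bits is 1)

# --- Rejection ABC targeting p(theta | privatized query) ---
def abc_rejection(s_obs, n_sims=100_000, tol=2.0):
    """Accept prior draws whose simulated *privatized* query lands near s_obs."""
    thetas = rng.uniform(0.0, 1.0, size=n_sims)                 # Uniform(0, 1) prior
    s_sim = rng.binomial(n, thetas).astype(float)               # simulate the confidential query
    s_sim += rng.laplace(scale=1.0 / eps, size=n_sims)          # re-apply the privacy mechanism
    return thetas[np.abs(s_sim - s_obs) <= tol]

post = abc_rejection(s_priv)
print(f"posterior mean ~ {post.mean():.3f}, 95% interval ~ "
      f"({np.quantile(post, 0.025):.3f}, {np.quantile(post, 0.975):.3f})")
```

Because the simulator re-applies the Laplace noise before comparing queries, the accepted draws approximate the posterior given the noisy release rather than the (unavailable) confidential statistic, which is the bias correction the abstract refers to.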
Cite
Text
Xiong et al. "Simulation-Based Bayesian Inference from Privacy Protected Data." Transactions on Machine Learning Research, 2025.

Markdown

[Xiong et al. "Simulation-Based Bayesian Inference from Privacy Protected Data." Transactions on Machine Learning Research, 2025.](https://mlanthology.org/tmlr/2025/xiong2025tmlr-simulationbased/)

BibTeX
@article{xiong2025tmlr-simulationbased,
  title   = {{Simulation-Based Bayesian Inference from Privacy Protected Data}},
  author  = {Xiong, Yifei and Ju, Nianqiao and Zhang, Sanguo},
  journal = {Transactions on Machine Learning Research},
  year    = {2025},
  url     = {https://mlanthology.org/tmlr/2025/xiong2025tmlr-simulationbased/}
}