Auditing Empirical Privacy Protection of Private LLM Adaptations
Abstract
Recent work has applied differential privacy (DP) methods to adapt large language models (LLMs) for sensitive applications. While DP offers theoretical privacy guarantees, its practical implications for LLM adaptations remain uncertain. This uncertainty arises from LLM pretraining, where overlap and interdependencies between pretraining and adaptation data can impact privacy leakage despite DP adaptation efforts. To analyze the issue from a practical standpoint, we thoroughly investigate privacy risks under "private" adaptations of LLMs. Relying on the latest privacy attacks, such as robust membership inference, we study the actual privacy risks for the pretraining and adaptation data. We benchmark the privacy risks by systematically varying the distribution of the adaptation data, ranging from data perfectly overlapping with the pretraining set, through in-distribution (IID) scenarios, to entirely out-of-distribution (OOD) examples. Additionally, we evaluate how different adaptation methods and privacy regimes impact the vulnerability. Our results reveal that distribution shifts significantly affect the vulnerability to privacy attacks: the closer the distribution of the adaptation data is to the pretraining distribution, the higher its practical privacy risk, even when there is no overlap between pretraining and adaptation data. We find that the highest empirical privacy protection is achieved for OOD data using parameter-efficient fine-tuning (PEFT) methods, such as LoRA. Surprisingly, when considering data from the same distribution, using the pretraining data for adaptation exhibits privacy leakage similar to that of the corresponding validation data. To effectively prevent privacy leakage, the adaptations must be trained with strict differential privacy guarantees. Finally, our results show that private adaptations, especially those based on prefix tuning, can also decrease the empirical leakage from the pretraining data.
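To illustrate the kind of empirical auditing the abstract describes, the sketch below computes a simple loss-threshold membership-inference score against an adapted causal LLM. This is a minimal illustration, not the robust attack used in the paper; the model name, threshold calibration, and example texts are placeholder assumptions.

```python
# Minimal sketch (illustrative only): loss-based membership-inference score
# for auditing empirical privacy leakage of an adapted LLM. Members of the
# adaptation set tend to receive lower language-modeling loss than non-members.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"  # placeholder; the paper evaluates larger, DP-adapted LLMs
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name).eval()

@torch.no_grad()
def sample_loss(text: str) -> float:
    """Per-sample language-modeling loss, used here as the membership score."""
    ids = tokenizer(text, return_tensors="pt").input_ids
    out = model(ids, labels=ids)  # Hugging Face shifts labels internally
    return out.loss.item()

def is_member(text: str, threshold: float) -> bool:
    """Flag a sample as a likely training member if its loss falls below a
    threshold calibrated on known non-member (e.g., validation) data."""
    return sample_loss(text) < threshold

# Calibrate the threshold on held-out non-members (placeholder examples).
non_members = ["An example held-out sentence.", "Another held-out sentence."]
threshold = sum(sample_loss(t) for t in non_members) / len(non_members)
print(is_member("A candidate sentence from the adaptation set.", threshold))
```

In the paper's setting, such scores would be computed for pretraining and adaptation samples under different adaptation methods (e.g., LoRA, prefix tuning) and DP regimes to compare their empirical leakage.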
Cite
Text
Rossi et al. "Auditing Empirical Privacy Protection of Private LLM Adaptations." NeurIPS 2024 Workshops: SafeGenAi, 2024.
Markdown
[Rossi et al. "Auditing Empirical Privacy Protection of Private LLM Adaptations." NeurIPS 2024 Workshops: SafeGenAi, 2024.](https://mlanthology.org/neuripsw/2024/rossi2024neuripsw-auditing/)
BibTeX
@inproceedings{rossi2024neuripsw-auditing,
title = {{Auditing Empirical Privacy Protection of Private LLM Adaptations}},
author = {Rossi, Lorenzo and Marek, Bartłomiej and Hanke, Vincent and Wang, Xun and Backes, Michael and Dziedzic, Adam and Boenisch, Franziska},
booktitle = {NeurIPS 2024 Workshops: SafeGenAi},
year = {2024},
url = {https://mlanthology.org/neuripsw/2024/rossi2024neuripsw-auditing/}
}