Differentially Private Learning Needs Better Model Initialization and Self-Distillation
Abstract
Differentially private SGD (DPSGD) enables privacy-preserving training of language models, but often reduces utility, diversity, and linguistic quality. We introduce DPRefine, a three-phase method that initializes a model using data synthesis from a small pre-trained LM with rigorous filtering, applies DP finetuning on private data, and performs self-distillation to refine outputs. This approach significantly outperforms vanilla DPSGD, with AlpacaEval preferring DPRefine's generations in 78.4% of cases across all datasets. Our analysis reveals that DPRefine reduces linguistic errors in generated text by 84.0%, mitigating the grammar and spelling errors commonly associated with DPSGD. It also reduces inconsistencies present in non-private models, such as hallucinated details and misattributed quotes. We find that small models like GPT-2 can be effective for initialization and distillation, highlighting their potential in enabling scalable and efficient deployment of privacy-preserving language models.
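The DP finetuning phase rests on the core DP-SGD update: clip each example's gradient to a fixed norm, average, and add Gaussian noise before stepping. A minimal plain-Python sketch of one such update (the function name, flat-vector gradients, and hyperparameter defaults are illustrative, not the paper's implementation):

```python
import math
import random

def dp_sgd_step(per_example_grads, clip_norm=1.0, noise_multiplier=1.0,
                lr=0.1, rng=None):
    """One DP-SGD update on flat gradient vectors.

    Clips each per-example gradient to L2 norm `clip_norm`, sums them,
    adds Gaussian noise scaled by `noise_multiplier * clip_norm`, and
    returns the parameter delta (-lr * noisy mean gradient).
    """
    rng = rng or random.Random(0)
    dim = len(per_example_grads[0])
    clipped_sum = [0.0] * dim
    for g in per_example_grads:
        norm = math.sqrt(sum(x * x for x in g))
        scale = min(1.0, clip_norm / norm) if norm > 0 else 1.0
        for i in range(dim):
            clipped_sum[i] += g[i] * scale
    n = len(per_example_grads)
    noisy_mean = [
        (clipped_sum[i] + rng.gauss(0.0, noise_multiplier * clip_norm)) / n
        for i in range(dim)
    ]
    return [-lr * u for u in noisy_mean]
```

Clipping bounds each example's influence on the update (bounded sensitivity), which is what makes the added Gaussian noise yield a differential privacy guarantee; in practice a library such as Opacus handles the per-example gradients and privacy accounting.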
Cite

Text

Ngong et al. "Differentially Private Learning Needs Better Model Initialization and Self-Distillation." NeurIPS 2024 Workshops: SoLaR, 2024.

Markdown

[Ngong et al. "Differentially Private Learning Needs Better Model Initialization and Self-Distillation." NeurIPS 2024 Workshops: SoLaR, 2024.](https://mlanthology.org/neuripsw/2024/ngong2024neuripsw-differentially/)

BibTeX
@inproceedings{ngong2024neuripsw-differentially,
title = {{Differentially Private Learning Needs Better Model Initialization and Self-Distillation}},
author = {Ngong, Ivoline C. and Near, Joseph and Mireshghallah, Niloofar},
booktitle = {NeurIPS 2024 Workshops: SoLaR},
year = {2024},
url = {https://mlanthology.org/neuripsw/2024/ngong2024neuripsw-differentially/}
}