Memorization and Privacy Risks in Domain-Specific Large Language Models
Abstract
Recent literature has explored fine-tuning LLMs on domain-specific corpora to improve performance in those domains. However, the risk that these models memorize and leak sensitive information from third-party custom fine-tuning data poses significant potential harm to individuals and organizations. Given this risk, and the widespread use of domain-specific LLMs in many high-stakes domains, it is imperative to examine whether, and to what degree, domain-specific LLMs memorize their fine-tuning data. Through a series of experiments, we show that these models exhibit a significant capacity for memorizing fine-tuning data, which results in substantial privacy leakage. Furthermore, our investigation reveals that randomly removing certain words and rephrasing prompts are promising strategies for mitigating memorization.
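As a minimal sketch of the word-removal mitigation mentioned above, one can perturb a prompt by independently dropping each word with some probability before querying the model. The function name, dropout probability, and exact procedure below are illustrative assumptions, not the paper's reported method.

```python
import random

def randomly_remove_words(prompt: str, drop_prob: float = 0.1, seed=None) -> str:
    """Randomly drop words from a prompt before querying the model.

    Hypothetical illustration of the word-removal mitigation described
    in the abstract; the paper's exact procedure may differ.
    """
    rng = random.Random(seed)
    words = prompt.split()
    # Keep each word with probability (1 - drop_prob).
    kept = [w for w in words if rng.random() > drop_prob]
    # Fall back to the original prompt if everything was dropped.
    return " ".join(kept) if kept else prompt

# Example: perturb a prompt that might otherwise trigger verbatim recall
# of memorized fine-tuning data.
print(randomly_remove_words(
    "Patient John Doe was admitted on March 3 with", drop_prob=0.2, seed=0))
```

The intuition is that perturbing the prompt breaks the exact token prefix the model associates with a memorized continuation, reducing verbatim regurgitation at a small cost to prompt fidelity.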
Cite
Text
Yang et al. "Memorization and Privacy Risks in Domain-Specific Large Language Models." ICLR 2024 Workshops: R2-FM, 2024.
Markdown
[Yang et al. "Memorization and Privacy Risks in Domain-Specific Large Language Models." ICLR 2024 Workshops: R2-FM, 2024.](https://mlanthology.org/iclrw/2024/yang2024iclrw-memorization/)
BibTeX
@inproceedings{yang2024iclrw-memorization,
  title     = {{Memorization and Privacy Risks in Domain-Specific Large Language Models}},
  author    = {Yang, Xinyu and Wen, Zichen and Qu, Wenjie and Chen, Zhaorun and Xiang, Zhiying and Chen, Beidi and Yao, Huaxiu},
  booktitle = {ICLR 2024 Workshops: R2-FM},
  year      = {2024},
  url       = {https://mlanthology.org/iclrw/2024/yang2024iclrw-memorization/}
}