Forget to Flourish: Leveraging Machine-Unlearning on Pretrained Language Models for Privacy Leakage

Rashid, Md. Rafi Ur; Liu, Jing; Koike-Akino, Toshiaki; Wang, Ye; Mehnaz, Shagufta

doi:10.1609/AAAI.V39I19.34218

AAAI 2025 pp. 20139-20147

Abstract

Fine-tuning large language models on private data for downstream applications poses significant privacy risks, because the resulting models can expose sensitive information. Several popular community platforms now offer convenient distribution of a large variety of pretrained models, allowing anyone to publish without rigorous verification. This scenario creates a privacy threat, as pretrained models can be intentionally crafted to compromise the privacy of fine-tuning datasets. In this study, we introduce a novel poisoning technique that uses machine unlearning as an attack tool. This approach manipulates a pretrained language model to increase the leakage of private data during the fine-tuning process. Our method enhances both membership inference and data extraction attacks while preserving model utility. Experimental results across different models, datasets, and fine-tuning setups demonstrate that our attacks significantly surpass baseline performance. This work serves as a cautionary note for users who download pretrained models from unverified sources, highlighting the potential risks involved.
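
For readers unfamiliar with membership inference, the following is a minimal sketch of a generic loss-threshold attack, which scores an example as more likely to be a training member when the model's loss on it is lower. It is not the method of this paper; the loss values, the function names `membership_scores` and `auc`, and the choice of negative loss as the score are all assumptions made for illustration.

```python
from typing import List, Sequence


def membership_scores(losses: Sequence[float]) -> List[float]:
    """Turn per-example losses into membership scores.

    Assumption: lower loss suggests the example was seen in training,
    so the score is simply the negated loss.
    """
    return [-loss for loss in losses]


def auc(member_scores: Sequence[float], nonmember_scores: Sequence[float]) -> float:
    """Estimate attack AUC as the fraction of (member, non-member) pairs
    in which the member receives the higher score. Ties count as half."""
    wins = 0.0
    for m in member_scores:
        for n in nonmember_scores:
            if m > n:
                wins += 1.0
            elif m == n:
                wins += 0.5
    return wins / (len(member_scores) * len(nonmember_scores))


# Hypothetical per-example losses from a fine-tuned model (made-up numbers).
member_losses = [0.4, 0.6, 0.9, 1.1]      # examples in the fine-tuning set
nonmember_losses = [1.0, 1.3, 1.6, 2.0]   # held-out examples

attack_auc = auc(
    membership_scores(member_losses),
    membership_scores(nonmember_losses),
)
print(f"attack AUC: {attack_auc:.4f}")
```

An AUC near 0.5 would mean the attacker cannot distinguish members from non-members, while values closer to 1.0 indicate stronger leakage. A poisoned pretrained model of the kind the abstract describes would aim to widen the gap between member and non-member losses after fine-tuning, which would raise this number.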

Cite

Text

Rashid et al. "Forget to Flourish: Leveraging Machine-Unlearning on Pretrained Language Models for Privacy Leakage." AAAI Conference on Artificial Intelligence, 2025. doi:10.1609/AAAI.V39I19.34218

Markdown

[Rashid et al. "Forget to Flourish: Leveraging Machine-Unlearning on Pretrained Language Models for Privacy Leakage." AAAI Conference on Artificial Intelligence, 2025.](https://mlanthology.org/aaai/2025/rashid2025aaai-forget/) doi:10.1609/AAAI.V39I19.34218

BibTeX

@inproceedings{rashid2025aaai-forget,
  title     = {{Forget to Flourish: Leveraging Machine-Unlearning on Pretrained Language Models for Privacy Leakage}},
  author    = {Rashid, Md. Rafi Ur and Liu, Jing and Koike-Akino, Toshiaki and Wang, Ye and Mehnaz, Shagufta},
  booktitle = {AAAI Conference on Artificial Intelligence},
  year      = {2025},
  pages     = {20139-20147},
  doi       = {10.1609/AAAI.V39I19.34218},
  url       = {https://mlanthology.org/aaai/2025/rashid2025aaai-forget/}
}