How Does LLM Compression Affect Weight Exfiltration Attacks?

Abstract

As frontier AIs become more powerful and costly to develop, adversaries have increasing incentives to mount weight exfiltration attacks. In this work, we explore how advanced compression techniques can significantly heighten this risk, particularly for large language models (LLMs). By tailoring compression specifically for exfiltration rather than inference, we demonstrate that attackers could achieve up to 16× compression with minimal trade-offs, reducing exfiltration time from months to days. To quantify this risk, we propose a model of exfiltration success and show how compression tactics can greatly reduce exfiltration time and increase attack success rates. With AIs becoming increasingly valuable to industry and government, our findings underscore the urgent need to develop defenses against weight exfiltration and to secure model weights.
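The "months to days" claim follows from simple transfer-rate arithmetic: compressing the weights by a factor of 16 shrinks transfer time by roughly the same factor. The minimal sketch below illustrates this; all sizes and bandwidths in it are hypothetical assumptions chosen for illustration, not figures from the paper.

# Back-of-the-envelope exfiltration-time arithmetic (illustrative only;
# the model size and covert-channel bandwidth are hypothetical assumptions).

SECONDS_PER_DAY = 86_400

def exfiltration_days(model_bytes: float, egress_bytes_per_sec: float,
                      compression_ratio: float = 1.0) -> float:
    """Days needed to move a model of `model_bytes` over a covert channel
    with sustained throughput `egress_bytes_per_sec`, after compressing
    the weights by `compression_ratio`."""
    return model_bytes / compression_ratio / egress_bytes_per_sec / SECONDS_PER_DAY

# Hypothetical scenario: a ~1 TB checkpoint (e.g., ~500B parameters in fp16)
# moved over a slow covert channel of ~1 Mbit/s (~125 KB/s).
model_bytes = 1e12
egress = 125_000  # bytes per second

baseline = exfiltration_days(model_bytes, egress)          # ~92.6 days (months)
compressed = exfiltration_days(model_bytes, egress, 16.0)  # ~5.8 days

print(f"uncompressed:   {baseline:.1f} days")
print(f"16x compressed: {compressed:.1f} days")

Under these assumed numbers, 16× compression takes the transfer from roughly three months down to under a week, matching the scale of the trade-off described in the abstract.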

Cite

Text

Brown and Mazeika. "How Does LLM Compression Affect Weight Exfiltration Attacks?" NeurIPS 2024 Workshops: SoLaR, 2024.

Markdown

[Brown and Mazeika. "How Does LLM Compression Affect Weight Exfiltration Attacks?" NeurIPS 2024 Workshops: SoLaR, 2024.](https://mlanthology.org/neuripsw/2024/brown2024neuripsw-llm/)

BibTeX

@inproceedings{brown2024neuripsw-llm,
  title     = {{How Does LLM Compression Affect Weight Exfiltration Attacks?}},
  author    = {Brown, Davis and Mazeika, Mantas},
  booktitle = {NeurIPS 2024 Workshops: SoLaR},
  year      = {2024},
  url       = {https://mlanthology.org/neuripsw/2024/brown2024neuripsw-llm/}
}