The Fundamental Limits of LLM Unlearning: Complexity-Theoretic Barriers and Provably Optimal Protocols

Abstract

Modern machine unlearning techniques for large language models (LLMs) remain heuristic, lacking formal characterization of their fundamental computational limits. We establish the first complexity-theoretic foundation for LLM unlearning, revealing intrinsic tradeoffs between efficiency, precision, and regulatory compliance. Our framework formalizes (ϵ, δ)-machine unlearning via measure-theoretic alignment of retrained and unlearned model distributions, then proves transformer-specific hardness results: exact unlearning is coNP-hard, while approximate unlearning requires Ω(T^{1−o(1)}) time under the Exponential Time Hypothesis (ETH). We construct an optimal Recursive Sketch-and-Freeze protocol achieving these bounds through differential privacy duality and Kronecker-product sketching. Crucially, we identify phase transitions in Rényi unlearning cost at critical model scales (n ≈ d log k). These results provide (1) theoretical benchmarks for evaluating unlearning algorithms, (2) complexity-aware guidelines for AI regulation, and (3) mathematically grounded verification tools for GDPR/CPRA compliance.
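
The (ϵ, δ)-unlearning guarantee named in the abstract is not spelled out on this page. As a point of reference, the LaTeX sketch below records the standard distributional form used in the machine-unlearning literature, which the abstract's measure-theoretic alignment of retrained and unlearned models appears to instantiate; the symbols A (training algorithm), U (unlearning algorithm), D (training set), S (forget set), and E (measurable event) are illustrative assumptions, not notation taken from the paper.

% Standard (epsilon, delta)-unlearning guarantee (illustrative sketch;
% A, U, D, S, E are assumed notation, not taken from the paper).
% For every dataset D, forget set S \subseteq D, and measurable event E:
\[
  \Pr\big[\, U(A(D), D, S) \in E \,\big]
    \;\le\; e^{\epsilon}\,\Pr\big[\, A(D \setminus S) \in E \,\big] + \delta,
\]
\[
  \Pr\big[\, A(D \setminus S) \in E \,\big]
    \;\le\; e^{\epsilon}\,\Pr\big[\, U(A(D), D, S) \in E \,\big] + \delta.
\]

Setting δ = 0 yields pure ϵ-indistinguishability between the unlearned and retrained models, mirroring the (ϵ, δ)-differential-privacy relation that underlies the differential privacy duality the abstract invokes.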

Cite

Text

Srivastava. "The Fundamental Limits of LLM Unlearning: Complexity-Theoretic Barriers and Provably Optimal Protocols." ICLR 2025 Workshops: BuildingTrust, 2025.

Markdown

[Srivastava. "The Fundamental Limits of LLM Unlearning: Complexity-Theoretic Barriers and Provably Optimal Protocols." ICLR 2025 Workshops: BuildingTrust, 2025.](https://mlanthology.org/iclrw/2025/srivastava2025iclrw-fundamental/)

BibTeX

@inproceedings{srivastava2025iclrw-fundamental,
  title     = {{The Fundamental Limits of LLM Unlearning: Complexity-Theoretic Barriers and Provably Optimal Protocols}},
  author    = {Srivastava, Aviral},
  booktitle = {ICLR 2025 Workshops: BuildingTrust},
  year      = {2025},
  url       = {https://mlanthology.org/iclrw/2025/srivastava2025iclrw-fundamental/}
}