The Fundamental Limits of LLM Unlearning: Complexity-Theoretic Barriers and Provably Optimal Protocols
Abstract
Modern machine unlearning techniques for large language models (LLMs) remain heuristic, lacking a formal characterization of their fundamental computational limits. We establish the first complexity-theoretic foundation for LLM unlearning, revealing intrinsic tradeoffs between efficiency, precision, and regulatory compliance. Our framework formalizes (ϵ, δ)-machine unlearning via measure-theoretic alignment of retrained and unlearned model distributions, then proves transformer-specific hardness results: exact unlearning is coNP-hard, while approximate unlearning requires Ω(T^{1−o(1)}) time under the Exponential Time Hypothesis (ETH). We construct an optimal Recursive Sketch-and-Freeze protocol achieving these bounds through differential privacy duality and Kronecker-product sketching. Crucially, we identify phase transitions in Rényi unlearning cost at critical model scales (n ≈ d log k). These results provide (1) theoretical benchmarks for evaluating unlearning algorithms, (2) complexity-aware guidelines for AI regulation, and (3) mathematically grounded verification tools for GDPR/CPRA compliance.
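The abstract does not spell out the (ϵ, δ)-unlearning notion it formalizes; as orientation, a common measure-theoretic definition from the unlearning literature (a sketch of the standard formulation, which may differ in details from this paper's) compares the unlearned model's distribution against that of retraining from scratch:

```latex
% Standard (\epsilon,\delta)-unlearning (an assumption; the paper's exact
% definition may differ). A is the learning algorithm, U the unlearning
% algorithm, D the dataset, z \subseteq D the forget set, and S ranges
% over all measurable sets of models.
\[
  \Pr\bigl[\,U(A(D),\, D,\, z) \in S\,\bigr]
    \;\le\; e^{\epsilon}\,\Pr\bigl[\,A(D \setminus z) \in S\,\bigr] + \delta,
\]
\[
  \Pr\bigl[\,A(D \setminus z) \in S\,\bigr]
    \;\le\; e^{\epsilon}\,\Pr\bigl[\,U(A(D),\, D,\, z) \in S\,\bigr] + \delta.
\]
```

Under this reading, ϵ = δ = 0 recovers exact unlearning (the coNP-hard case above), while ϵ, δ > 0 corresponds to the approximate regime governed by the Ω(T^{1−o(1)}) ETH lower bound.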
Cite
Text
Srivastava. "The Fundamental Limits of LLM Unlearning: Complexity-Theoretic Barriers and Provably Optimal Protocols." ICLR 2025 Workshops: BuildingTrust, 2025.
Markdown
[Srivastava. "The Fundamental Limits of LLM Unlearning: Complexity-Theoretic Barriers and Provably Optimal Protocols." ICLR 2025 Workshops: BuildingTrust, 2025.](https://mlanthology.org/iclrw/2025/srivastava2025iclrw-fundamental/)
BibTeX
@inproceedings{srivastava2025iclrw-fundamental,
title = {{The Fundamental Limits of LLM Unlearning: Complexity-Theoretic Barriers and Provably Optimal Protocols}},
author = {Srivastava, Aviral},
booktitle = {ICLR 2025 Workshops: BuildingTrust},
year = {2025},
url = {https://mlanthology.org/iclrw/2025/srivastava2025iclrw-fundamental/}
}