Adaptive Localization of Knowledge Negation for Continual LLM Unlearning

Abstract

As large language models (LLMs) are deployed across increasingly diverse domains, concerns regarding their safety have grown substantially. LLM unlearning has emerged as a pivotal approach to removing harmful or unlawful content while maintaining utility. Despite increasing interest, the challenges of continual unlearning, which is common in real-world scenarios, remain underexplored. Successive unlearning tasks often compound utility degradation. To effectively unlearn targeted knowledge while preserving LLM utility, it is essential to minimize changes in model parameters by selectively updating only those linked to the target knowledge, thereby leaving other knowledge unaffected. Building on the task vector framework, we propose a new method named ALKN (Adaptive Localization of Knowledge Negation), which uses dynamic masking to sparsify training gradients and adaptively adjusts unlearning intensity based on inter-task relationships. Comprehensive experiments on three well-established LLM unlearning datasets demonstrate that our approach consistently outperforms baseline methods in both unlearning effectiveness and utility retention under continual unlearning settings.
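
To make the task-vector idea mentioned in the abstract concrete, below is a minimal PyTorch sketch of negating a sparsely localized task vector: fine-tune a copy of the model on the forget data with gradients masked to the largest-magnitude entries, then subtract the resulting parameter difference from the base model. This is an illustrative assumption of how such a pipeline could look, not the authors' ALKN implementation; the function names (top_k_mask, unlearn_task_vector, apply_negated_vector) and hyperparameters (keep_ratio, alpha) are hypothetical.

import copy
import torch
import torch.nn as nn

def top_k_mask(grads, keep_ratio=0.1):
    # Keep only the largest-magnitude gradient entries per tensor (assumed masking rule).
    masks = {}
    for name, g in grads.items():
        k = max(1, int(keep_ratio * g.numel()))
        threshold = g.abs().flatten().topk(k).values.min()
        masks[name] = (g.abs() >= threshold).float()
    return masks

def unlearn_task_vector(base_model, forget_batches, lr=1e-4, keep_ratio=0.1, steps=100):
    # Fine-tune a copy on the forget data with sparsified gradients,
    # then return the (sparse) task vector theta_ft - theta_base.
    model = copy.deepcopy(base_model)
    opt = torch.optim.SGD(model.parameters(), lr=lr)
    loss_fn = nn.CrossEntropyLoss()
    for _, (x, y) in zip(range(steps), forget_batches):
        opt.zero_grad()
        loss = loss_fn(model(x), y)
        loss.backward()
        grads = {n: p.grad.detach() for n, p in model.named_parameters() if p.grad is not None}
        masks = top_k_mask(grads, keep_ratio)  # dynamic mask, recomputed every step
        for n, p in model.named_parameters():
            if p.grad is not None:
                p.grad.mul_(masks[n])          # zero out gradients outside the mask
        opt.step()
    base = dict(base_model.named_parameters())
    return {n: (p.detach() - base[n].detach()) for n, p in model.named_parameters()}

def apply_negated_vector(base_model, task_vector, alpha=1.0):
    # Negate the task vector: theta_unlearned = theta_base - alpha * (theta_ft - theta_base).
    # In an adaptive scheme, alpha would be set per task, e.g. from inter-task similarity.
    with torch.no_grad():
        for n, p in base_model.named_parameters():
            p.sub_(alpha * task_vector[n])

Because only the masked parameters change during fine-tuning, the subtracted task vector touches a small fraction of the weights, which is the intuition behind limiting utility loss across successive unlearning tasks.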

Cite

Text

Wuerkaixi et al. "Adaptive Localization of Knowledge Negation for Continual LLM Unlearning." Proceedings of the 42nd International Conference on Machine Learning, 2025.

Markdown

[Wuerkaixi et al. "Adaptive Localization of Knowledge Negation for Continual LLM Unlearning." Proceedings of the 42nd International Conference on Machine Learning, 2025.](https://mlanthology.org/icml/2025/wuerkaixi2025icml-adaptive/)

BibTeX

@inproceedings{wuerkaixi2025icml-adaptive,
  title     = {{Adaptive Localization of Knowledge Negation for Continual LLM Unlearning}},
  author    = {Wuerkaixi, Abudukelimu and Wang, Qizhou and Cui, Sen and Xu, Wutong and Han, Bo and Niu, Gang and Sugiyama, Masashi and Zhang, Changshui},
  booktitle = {Proceedings of the 42nd International Conference on Machine Learning},
  year      = {2025},
  pages     = {68094--68117},
  volume    = {267},
  url       = {https://mlanthology.org/icml/2025/wuerkaixi2025icml-adaptive/}
}