AttentionBreaker: Adaptive Evolutionary Optimization for Unmasking Vulnerabilities in LLMs Through Bit-Flip Attacks
Abstract
Large language models (LLMs) have significantly advanced natural language processing (NLP), yet they remain susceptible to hardware-based threats, particularly bit-flip attacks (BFAs). Traditional BFA techniques, which require iterative gradient recalculation after each bit flip, become computationally prohibitive and exhaust memory as model size grows, making them impractical for state-of-the-art LLMs. To overcome these limitations, we propose AttentionBreaker, a novel framework for efficient parameter-space exploration, incorporating GenBFA, an evolutionary optimization method that identifies the most vulnerable bits in LLMs. Our approach demonstrates unprecedented efficacy: flipping just three bits in the LLaMA3-8B-Instruct model, quantized to 8-bit weights (W8), completely collapses performance, reducing Massive Multitask Language Understanding (MMLU) accuracy from 67.3% to 0% and increasing Wikitext perplexity by a factor of $10^5$. Furthermore, AttentionBreaker circumvents existing defenses against BFAs on transformer-based architectures, exposing a critical security risk. The framework is open-sourced at: https://github.com/TIES-Lab/attnbreaker.
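To make the core idea of an evolutionary bit-flip search concrete, below is a minimal, illustrative sketch. It is not the authors' GenBFA implementation: it runs a simple genetic-algorithm-style search for the three most damaging bit flips in a toy int8 weight matrix, using output deviation as a stand-in for the task metrics (MMLU accuracy, perplexity) that AttentionBreaker targets. All function names and the toy model are assumptions for illustration only.

```python
# Illustrative sketch only (not the authors' code): evolutionary search for the
# most damaging bit flips in a toy int8 "quantized" weight matrix.
import numpy as np

rng = np.random.default_rng(0)

# Toy quantized weights of a linear map y = W x.
W = rng.integers(-128, 127, size=(16, 16), dtype=np.int8)
x = rng.standard_normal(16).astype(np.float32)
y_ref = W.astype(np.float32) @ x  # reference output before any bit flips


def flip_bits(weights, positions):
    """Return a copy of the int8 matrix with the given (row, col, bit) positions flipped."""
    w = weights.copy()
    wu = w.view(np.uint8)  # reinterpret bytes so XOR is well-defined
    for r, c, b in positions:
        wu[r, c] ^= np.uint8(1 << b)
    return w


def damage(positions):
    """Fitness of a candidate: how far the corrupted output drifts from the reference."""
    y = flip_bits(W, positions).astype(np.float32) @ x
    return float(np.mean((y - y_ref) ** 2))


def random_candidate(n_flips=3):
    return [(rng.integers(16), rng.integers(16), rng.integers(8)) for _ in range(n_flips)]


def mutate(candidate):
    """Replace one flip in the candidate with a fresh random flip."""
    child = list(candidate)
    child[rng.integers(len(child))] = (rng.integers(16), rng.integers(16), rng.integers(8))
    return child


# Simple evolutionary loop: keep the most damaging candidates, mutate them to form
# the next generation, and repeat for a fixed budget of generations.
population = [random_candidate() for _ in range(32)]
for _ in range(50):
    population.sort(key=damage, reverse=True)
    parents = population[:8]
    population = parents + [mutate(p) for p in parents for _ in range(3)]

best = max(population, key=damage)
print("most damaging 3-bit flips (row, col, bit):", best, "damage:", damage(best))
```

In the paper's setting, the search space is the quantized weights of an LLM and fitness is measured on downstream tasks, but the loop structure (evaluate, select, mutate) is the same general pattern sketched here.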
Cite
Text
Das et al. "AttentionBreaker: Adaptive Evolutionary Optimization for Unmasking Vulnerabilities in LLMs Through Bit-Flip Attacks." Transactions on Machine Learning Research, 2025.

Markdown
[Das et al. "AttentionBreaker: Adaptive Evolutionary Optimization for Unmasking Vulnerabilities in LLMs Through Bit-Flip Attacks." Transactions on Machine Learning Research, 2025.](https://mlanthology.org/tmlr/2025/das2025tmlr-attentionbreaker/)

BibTeX
@article{das2025tmlr-attentionbreaker,
title = {{AttentionBreaker: Adaptive Evolutionary Optimization for Unmasking Vulnerabilities in LLMs Through Bit-Flip Attacks}},
author = {Das, Sanjay and Bhattacharya, Swastik and Kundu, Souvik and Kundu, Shamik and Menon, Anand and Raha, Arnab and Basu, Kanad},
journal = {Transactions on Machine Learning Research},
year = {2025},
url = {https://mlanthology.org/tmlr/2025/das2025tmlr-attentionbreaker/}
}