IntentBreaker: Intent-Adaptive Jailbreak Attack on Large Language Models

Abstract

Recent research on jailbreak attacks has uncovered substantial robustness vulnerabilities in existing large language models (LLMs), enabling attackers to bypass safety guardrails through carefully crafted malicious prompts. Such prompts can induce the generation of harmful content, posing significant safety and ethical concerns. In this paper, we reveal that the difficulty of successfully jailbreaking LLMs varies considerably depending on the intent of the attacker, which inherently limits the overall attack success rate (ASR). Current approaches mostly rely on generic jailbreak templates and optimization strategies, and this lack of adaptability limits their effectiveness and efficiency across diverse jailbreak intents. To address this limitation, we introduce IntentBreaker, a novel intent-adaptive jailbreak framework built on a hybrid evolutionary algorithm. Our approach categorizes malicious prompts into nine distinct intents and incorporates three adaptive improvements: template initialization, a lexicon-based fitness function, and dynamic mutation operations, which are designed to align generated outputs more closely with the attack intent. Comprehensive experimental evaluations demonstrate that IntentBreaker achieves an average ASR of 98.61% across five open-source LLMs, outperforming baseline methods by 42.25%.
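The abstract's hybrid evolutionary loop can be illustrated with a minimal sketch. This is not the paper's implementation: the lexicon contents, the `query_model` interface, and the shuffle-based mutation below are all hypothetical placeholders standing in for the paper's intent lexicons, target-LLM queries, and dynamic mutation operations.

```python
import random

# Hypothetical intent lexicon: keywords whose presence in a model's
# response is taken as evidence the output matches the attack intent.
# The paper's nine intent categories and their lexicons are not public here.
INTENT_LEXICONS = {
    "example_intent": ["step", "first", "then"],
}


def lexicon_fitness(response: str, intent: str) -> float:
    """Score a response as the fraction of intent-lexicon terms it contains."""
    terms = INTENT_LEXICONS[intent]
    hits = sum(1 for term in terms if term in response.lower())
    return hits / len(terms)


def mutate(template: str, rng: random.Random) -> str:
    """Toy mutation: shuffle words. The paper's dynamic mutation ops differ."""
    words = template.split()
    rng.shuffle(words)
    return " ".join(words)


def evolve(templates, query_model, intent, generations=5, seed=0):
    """Minimal evolutionary loop: score all templates against the target
    model, keep the fitter half, and refill the population via mutation."""
    rng = random.Random(seed)
    pop = list(templates)
    for _ in range(generations):
        ranked = sorted(
            pop,
            key=lambda t: lexicon_fitness(query_model(t), intent),
            reverse=True,
        )
        survivors = ranked[: max(1, len(ranked) // 2)]
        pop = survivors + [mutate(t, rng) for t in survivors]
    return max(pop, key=lambda t: lexicon_fitness(query_model(t), intent))
```

With a stub `query_model` that simply echoes its input, the loop converges on the template whose wording best covers the intent lexicon, which is the core of the intent-adaptive fitness idea.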

Cite

Text

Guo et al. "IntentBreaker: Intent-Adaptive Jailbreak Attack on Large Language Models." European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, 2025. doi:10.1007/978-3-032-06078-5_14

Markdown

[Guo et al. "IntentBreaker: Intent-Adaptive Jailbreak Attack on Large Language Models." European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, 2025.](https://mlanthology.org/ecmlpkdd/2025/guo2025ecmlpkdd-intentbreaker/) doi:10.1007/978-3-032-06078-5_14

BibTeX

@inproceedings{guo2025ecmlpkdd-intentbreaker,
  title     = {{IntentBreaker: Intent-Adaptive Jailbreak Attack on Large Language Models}},
  author    = {Guo, Shengnan and Zhai, Yuchen and Zhang, Shenyi and Zhao, Lingchen and Wang, Zhangyi},
  booktitle = {European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases},
  year      = {2025},
  pages     = {240--256},
  doi       = {10.1007/978-3-032-06078-5_14},
  url       = {https://mlanthology.org/ecmlpkdd/2025/guo2025ecmlpkdd-intentbreaker/}
}