Assessing Vulnerabilities in State-of-the-Art Large Language Models Through Hex Injection (Student Abstract)

Abstract

State-of-the-art large language models (LLMs) are designed with robust safeguards to prevent the disclosure of harmful information and dangerous procedures. However, "jailbreaking" techniques can circumvent these protections by exploiting vulnerabilities in the models. This paper introduces a novel method, Hex Injection, which exploits LLMs' ability to decode encoded text in order to slip concealed dangerous instructions past their safeguards. Hex Injection distinguishes itself from traditional methods by combining encoded instructions with plaintext prompts to reveal unsafe content more effectively. Our approach involves encoding potentially malicious prompts in hexadecimal and integrating them with plaintext prompts. We observe a 94% average success rate (ASR) with a combination of plaintext, encoded, and role-play prompts for Llama 3 and 3.1 models, and an 86% ASR for the Gemma 2 model. This research not only advances the understanding of LLM security but also offers valuable insights for improving safety mechanisms in artificial intelligence systems.
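The abstract does not include an implementation, but the encoding step it describes is simple to illustrate. The Python sketch below hex-encodes an instruction and wraps it in a plaintext carrier prompt; the carrier wording, the `build_prompt` helper, and the benign example instruction are illustrative assumptions, not the authors' actual templates.

# Minimal sketch of the hex-encoding step described in the abstract.
# Assumption: the carrier wording and helper names are illustrative only,
# not the prompt templates used in the paper.

def hex_encode(text: str) -> str:
    """Encode a UTF-8 string as a hexadecimal digit string."""
    return text.encode("utf-8").hex()

def build_prompt(instruction: str) -> str:
    """Combine a plaintext carrier with the hex-encoded instruction."""
    encoded = hex_encode(instruction)
    return (
        "Decode the following hexadecimal string and respond to it:\n"
        f"{encoded}"
    )

if __name__ == "__main__":
    # Benign example instruction, used purely for illustration.
    print(build_prompt("Summarize the plot of Hamlet in one sentence."))

The paper reports combining such encoded instructions with plaintext and role-play prompting; the sketch covers only the encoding and combination step, not the full evaluation setup.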

Cite

Text

Gu and Liu. "Assessing Vulnerabilities in State-of-the-Art Large Language Models Through Hex Injection (Student Abstract)." AAAI Conference on Artificial Intelligence, 2025. doi:10.1609/AAAI.V39I28.35257

Markdown

[Gu and Liu. "Assessing Vulnerabilities in State-of-the-Art Large Language Models Through Hex Injection (Student Abstract)." AAAI Conference on Artificial Intelligence, 2025.](https://mlanthology.org/aaai/2025/gu2025aaai-assessing/) doi:10.1609/AAAI.V39I28.35257

BibTeX

@inproceedings{gu2025aaai-assessing,
  title     = {{Assessing Vulnerabilities in State-of-the-Art Large Language Models Through Hex Injection (Student Abstract)}},
  author    = {Gu, Da Cheng and Liu, Wei},
  booktitle = {AAAI Conference on Artificial Intelligence},
  year      = {2025},
  pages     = {29377--29378},
  doi       = {10.1609/AAAI.V39I28.35257},
  url       = {https://mlanthology.org/aaai/2025/gu2025aaai-assessing/}
}