Raising the Numbers: Multi-Generation Adversarial Attack and Frequency-Based Defense for Heightened NLP Security
Abstract
The integration of Artificial Intelligence (AI) into applications of Natural Language Processing (NLP), such as spam detection and sentiment analysis, necessitates robust and explainable defenses against adversarial attacks—subtle input perturbations that can compromise model integrity. In this paper, we propose two novel methods, driven by Explainable AI (XAI), for enhancing the robustness of deep learning models in NLP. Traditional methods for generating adversarial examples are often resource-intensive relative to the number of samples produced, limiting their effectiveness in large-scale adversarial training. To address this problem, we propose, as our first contribution, a Multi-Generation Attack Strategy (MGAS) that leverages explainability techniques to generate a diverse set of adversarial examples for adversarial training. After a baseline adversarial text is crafted, we carefully perform three actions: swapping perturbations with alternatives, rolling back low-contributing words, and exchanging perturbed indices, thereby creating a diverse set of adversarial samples. Our second contribution introduces an additive correction defense mechanism that actively revises input texts at inference time. Using XAI to identify the most critical words in the input text, we substitute them with their most frequent suitable synonyms, thereby reducing the adversarial impact while preserving the model’s performance on clean data. Comprehensive evaluations demonstrate that both approaches, individually or combined, significantly enhance the robustness and transparency of AI models, offering a promising pathway for improving the security and reliability of NLP systems through XAI-driven techniques.
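The defense described above substitutes XAI-identified critical words with their most frequent suitable synonyms. A minimal sketch of that idea follows; the function name, the importance scores, and the synonym-frequency table are illustrative assumptions, not the authors' implementation (in practice the scores would come from an XAI attribution method and the frequencies from a corpus):

```python
def frequency_defense(tokens, importance, synonyms, top_k=2):
    """Hypothetical sketch: revise an input by replacing the top_k
    most important tokens with their most frequent synonym.

    tokens     -- list of input words
    importance -- per-token importance scores (assumed to come from XAI)
    synonyms   -- dict mapping a word to {synonym: corpus_frequency}
    """
    # Rank token positions by importance, highest first.
    ranked = sorted(range(len(tokens)), key=lambda i: importance[i], reverse=True)
    corrected = list(tokens)
    for i in ranked[:top_k]:
        candidates = synonyms.get(tokens[i])
        if candidates:
            # Pick the synonym with the highest corpus frequency.
            corrected[i] = max(candidates, key=candidates.get)
    return corrected


# Toy example with made-up scores and frequencies.
tokens = ["this", "film", "was", "dreadful", "overall"]
importance = [0.1, 0.2, 0.1, 0.9, 0.05]
synonyms = {"dreadful": {"terrible": 120, "awful": 300}}
print(frequency_defense(tokens, importance, synonyms))
```

The intuition is that an adversarial perturbation often lands on a high-importance word; mapping that word back to a common synonym tends to undo the perturbation while leaving clean inputs largely unchanged.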
Cite
Text
Khemis et al. "Raising the Numbers: Multi-Generation Adversarial Attack and Frequency-Based Defense for Heightened NLP Security." Machine Learning, 2025. doi:10.1007/S10994-025-06833-X
Markdown
[Khemis et al. "Raising the Numbers: Multi-Generation Adversarial Attack and Frequency-Based Defense for Heightened NLP Security." Machine Learning, 2025.](https://mlanthology.org/mlj/2025/khemis2025mlj-raising/) doi:10.1007/S10994-025-06833-X
BibTeX
@article{khemis2025mlj-raising,
title = {{Raising the Numbers: Multi-Generation Adversarial Attack and Frequency-Based Defense for Heightened NLP Security}},
author = {Khemis, Salim and Amara, Yacine and Benatia, Mohamed Akrem and Messalti, Ishak and Khanous, Mohammed Elamin Ilyas and De Baets, Bernard},
journal = {Machine Learning},
year = {2025},
pages = {207},
doi = {10.1007/S10994-025-06833-X},
volume = {114},
url = {https://mlanthology.org/mlj/2025/khemis2025mlj-raising/}
}