Certified Robustness in NLP Under Bounded Levenshtein Distance

Abstract

Natural Language Processing (NLP) models suffer from small perturbations that, if chosen adversarially, can dramatically change the output of the model. Verification methods can provide robustness certificates against such adversarial perturbations by computing a sound lower bound on the robust accuracy. Nevertheless, existing verification methods in NLP incur prohibitive costs and cannot practically handle Levenshtein distance constraints. We propose the first method for computing the Lipschitz constant of convolutional classifiers with respect to the Levenshtein distance. We use this Lipschitz constant estimation method to train 1-Lipschitz classifiers. This enables computing the certified radius of a classifier in a single forward pass. Our method, LipsLev, obtains $38.80$% and $13.93$% verified accuracy at distances $1$ and $2$, respectively, on the AG-News dataset. We believe our work can open the door to more efficient training and verification of NLP models.
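
The abstract's key idea is that, once a classifier is known to be 1-Lipschitz with respect to the Levenshtein distance, a certified radius follows from the prediction margin of a single forward pass. The sketch below illustrates this margin-based bound; it is a hypothetical illustration using the generic Lipschitz margin certificate, not the paper's exact formula, and the function name and API are assumptions.

import torch

def certified_radius(logits: torch.Tensor, lipschitz_constant: float = 1.0) -> torch.Tensor:
    """Hypothetical sketch: certified Levenshtein radius from one forward pass.

    Assumes every logit of the classifier is `lipschitz_constant`-Lipschitz
    with respect to the Levenshtein distance; the predicted class cannot
    change for perturbations whose distance is below the returned radius.
    """
    top2 = logits.topk(2, dim=-1).values        # best and runner-up logits
    margin = top2[..., 0] - top2[..., 1]        # prediction margin
    return margin / (2.0 * lipschitz_constant)  # generic Lipschitz margin bound

An input would then be certified robust at Levenshtein distance k whenever certified_radius(logits) >= k, which is how a verified accuracy at distances 1 and 2 can be computed without any search over perturbations.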

Cite

Text

Rocamora et al. "Certified Robustness in NLP Under Bounded Levenshtein Distance." ICML 2024 Workshops: NextGenAISafety, 2024.

Markdown

[Rocamora et al. "Certified Robustness in NLP Under Bounded Levenshtein Distance." ICML 2024 Workshops: NextGenAISafety, 2024.](https://mlanthology.org/icmlw/2024/rocamora2024icmlw-certified/)

BibTeX

@inproceedings{rocamora2024icmlw-certified,
  title     = {{Certified Robustness in NLP Under Bounded Levenshtein Distance}},
  author    = {Rocamora, Elias Abad and Chrysos, Grigorios and Cevher, Volkan},
  booktitle = {ICML 2024 Workshops: NextGenAISafety},
  year      = {2024},
  url       = {https://mlanthology.org/icmlw/2024/rocamora2024icmlw-certified/}
}