InfoBERT: Improving Robustness of Language Models from an Information Theoretic Perspective

Abstract

Large-scale language models such as BERT have achieved state-of-the-art performance across a wide range of NLP tasks. Recent studies, however, show that such BERT-based models are vulnerable to textual adversarial attacks. We aim to address this problem from an information-theoretic perspective, and propose InfoBERT, a novel learning framework for robust fine-tuning of pre-trained language models. InfoBERT contains two mutual-information-based regularizers for model training: (i) an Information Bottleneck regularizer, which suppresses noisy mutual information between the input and the feature representation; and (ii) a Robust Feature regularizer, which increases the mutual information between local robust features and global features. We provide a principled way to theoretically analyze and improve the robustness of representation learning for language models in both standard and adversarial training. Extensive experiments demonstrate that InfoBERT achieves state-of-the-art robust accuracy on several adversarial datasets for Natural Language Inference (NLI) and Question Answering (QA) tasks. Our code is available at https://github.com/AI-secure/InfoBERT.
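To make the two regularizers concrete, below is a minimal sketch of how mutual-information terms like these are typically estimated in practice with an InfoNCE-style contrastive lower bound. This is an illustration, not the authors' implementation (which lives in the linked repository): the function names, the choice of in-batch negatives, the pooling of features into per-example vectors, and the coefficients `alpha` and `beta` are all assumptions made for exposition.

```python
import torch
import torch.nn.functional as F

def infonce_lower_bound(feats_a, feats_b, temperature=0.1):
    """InfoNCE-style lower bound on the mutual information between two
    batches of paired features (hypothetical helper, not from the paper's code).

    feats_a, feats_b: (batch, dim) tensors; row i of each is a positive pair,
    and all other rows in the batch serve as negatives.
    """
    feats_a = F.normalize(feats_a, dim=-1)
    feats_b = F.normalize(feats_b, dim=-1)
    # Pairwise similarities; the diagonal holds the positive pairs.
    logits = feats_a @ feats_b.t() / temperature
    labels = torch.arange(logits.size(0), device=logits.device)
    # Negative cross-entropy over in-batch negatives gives the MI lower bound.
    return -F.cross_entropy(logits, labels)

def infobert_style_loss(task_loss, local_feats, global_feats,
                        input_embeds, feature_reprs, alpha=5e-3, beta=0.1):
    """Illustrative combined objective: the task loss, minus a Robust Feature
    term (raise I(local robust features; global features)), plus an
    Information Bottleneck term (suppress I(input; representation))."""
    robust_feature_mi = infonce_lower_bound(local_feats, global_feats)
    bottleneck_mi = infonce_lower_bound(input_embeds, feature_reprs)
    return task_loss - alpha * robust_feature_mi + beta * bottleneck_mi
```

Note the opposite signs: the Robust Feature term is subtracted so that training increases that mutual information, while the Information Bottleneck term is added so that training suppresses it, matching the two roles described in the abstract.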

Cite

Text

Wang et al. "InfoBERT: Improving Robustness of Language Models from an Information Theoretic Perspective." International Conference on Learning Representations, 2021.

Markdown

[Wang et al. "InfoBERT: Improving Robustness of Language Models from an Information Theoretic Perspective." International Conference on Learning Representations, 2021.](https://mlanthology.org/iclr/2021/wang2021iclr-infobert/)

BibTeX

@inproceedings{wang2021iclr-infobert,
  title     = {{InfoBERT: Improving Robustness of Language Models from an Information Theoretic Perspective}},
  author    = {Wang, Boxin and Wang, Shuohang and Cheng, Yu and Gan, Zhe and Jia, Ruoxi and Li, Bo and Liu, Jingjing},
  booktitle = {International Conference on Learning Representations},
  year      = {2021},
  url       = {https://mlanthology.org/iclr/2021/wang2021iclr-infobert/}
}