Metamorphic Testing and Certified Mitigation of Fairness Violations in NLP Models

Abstract

Natural language processing (NLP) models are increasingly used in sensitive application domains including credit scoring, insurance, and loan assessment. Hence, it is critical to ensure that the decisions made by NLP models are free of unfair bias toward certain subpopulation groups. In this paper, we propose a novel framework employing metamorphic testing, a well-established software testing scheme, to test NLP models and find discriminatory inputs that provoke fairness violations. Furthermore, inspired by recent breakthroughs in the certified robustness of machine learning, we formulate NLP model fairness in a practical setting as (ε, k)-fairness and accordingly smooth the model predictions to mitigate fairness violations. We demonstrate our technique on popular (commercial) NLP models, and successfully flag thousands of discriminatory inputs that can cause fairness violations. We further enhance the evaluated models by adding a certified fairness guarantee at modest cost.
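To make the two ideas in the abstract concrete, below is a minimal Python sketch of (i) metamorphic fairness testing, where swapping a protected attribute in the input should not change the model's decision, and (ii) a smoothing-style mitigation that votes over protected-attribute variants. The classifier interface model_predict, the gender swap table, and the plain majority vote are illustrative assumptions, not the authors' implementation, and the vote below does not carry the paper's (ε, k)-fairness certificate.

    """Sketch of metamorphic fairness testing and smoothing-based
    mitigation, in the spirit of Ma et al. (IJCAI 2020). The swap
    table and model interface are hypothetical."""
    from collections import Counter

    # Hypothetical protected-attribute substitutions (gender terms).
    SWAPS = {"he": "she", "she": "he", "his": "her", "her": "his",
             "man": "woman", "woman": "man"}

    def mutants(text):
        """Yield metamorphic variants of `text`, each replacing one
        protected token with its counterpart."""
        tokens = text.split()
        for i, tok in enumerate(tokens):
            if tok.lower() in SWAPS:
                mutated = list(tokens)
                mutated[i] = SWAPS[tok.lower()]
                yield " ".join(mutated)

    def find_violations(model_predict, texts):
        """Metamorphic relation: the decision must be invariant under
        protected-attribute substitution. Any variant that flips the
        prediction is flagged as a discriminatory input."""
        violations = []
        for text in texts:
            original = model_predict(text)
            for variant in mutants(text):
                if model_predict(variant) != original:
                    violations.append((text, variant))
        return violations

    def smoothed_predict(model_predict, text):
        """Mitigation sketch: predict by majority vote over the input
        and all of its protected-attribute variants (ties broken by
        insertion order), smoothing out single-substitution flips."""
        votes = Counter(model_predict(v) for v in [text, *mutants(text)])
        return votes.most_common(1)[0][0]

    # Toy usage: a deliberately biased classifier gets flagged.
    toy = lambda s: "deny" if "she" in s.lower().split() else "approve"
    print(find_violations(toy, ["He applied for a loan ."]))
    print(smoothed_predict(toy, "He applied for a loan ."))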

Cite

Text

Ma et al. "Metamorphic Testing and Certified Mitigation of Fairness Violations in NLP Models." International Joint Conference on Artificial Intelligence, 2020. doi:10.24963/IJCAI.2020/64

Markdown

[Ma et al. "Metamorphic Testing and Certified Mitigation of Fairness Violations in NLP Models." International Joint Conference on Artificial Intelligence, 2020.](https://mlanthology.org/ijcai/2020/ma2020ijcai-metamorphic/) doi:10.24963/IJCAI.2020/64

BibTeX

@inproceedings{ma2020ijcai-metamorphic,
  title     = {{Metamorphic Testing and Certified Mitigation of Fairness Violations in NLP Models}},
  author    = {Ma, Pingchuan and Wang, Shuai and Liu, Jin},
  booktitle = {International Joint Conference on Artificial Intelligence},
  year      = {2020},
  pages     = {458--465},
  doi       = {10.24963/IJCAI.2020/64},
  url       = {https://mlanthology.org/ijcai/2020/ma2020ijcai-metamorphic/}
}