Mdfa: Multi-Differential Fairness Auditor for Black Box Classifiers
Abstract
Machine learning algorithms are increasingly involved in sensitive decision-making processes with potentially adverse implications for individuals. This paper presents a new tool, mdfa, that identifies the characteristics of the victims of a classifier's discrimination. We measure discrimination as a violation of multi-differential fairness. Multi-differential fairness is a guarantee that a black-box classifier's outcomes do not leak information on the sensitive attributes of a small group of individuals. We reduce the problem of identifying worst-case violations to matching distributions and predicting where sensitive attributes and classifier outcomes coincide. We apply mdfa to a recidivism risk assessment classifier widely used in the United States and demonstrate that, for individuals with little criminal history, African-Americans identified by mdfa are three times more likely to be considered at high risk of violent recidivism than similar non-African-Americans.
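The reduction described in the abstract can be sketched concretely. The following is a minimal illustration, not the authors' mdfa implementation: the inverse-propensity matching step, the `audit_differential_fairness` helper, and the excess-accuracy violation score are all assumptions made for exposition. The idea: after reweighting so the feature distribution is comparable across sensitive groups, train an auditor to predict where the sensitive attribute and the black-box outcome coincide; any region where this agreement is predictable above chance is a candidate fairness violation.

```python
# Minimal sketch of the audit reduction, assuming binary sensitive
# attribute and outcomes. NOT the authors' mdfa implementation: the
# matching strategy and violation score below are illustrative.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier

def audit_differential_fairness(X, a, y_hat):
    """X: non-sensitive features; a: binary sensitive attribute (0/1);
    y_hat: black-box classifier outcomes (0/1); all numpy arrays."""
    # Step 1 (matching): reweight samples so the feature distribution is
    # comparable across sensitive groups. Inverse propensity weighting is
    # one simple way to do this (an assumption, not the paper's method).
    p = LogisticRegression().fit(X, a).predict_proba(X)[:, 1]
    p = np.clip(p, 1e-3, 1 - 1e-3)
    w = np.where(a == 1, 1.0 / p, 1.0 / (1.0 - p))

    # Step 2 (prediction): learn where the sensitive attribute and the
    # outcome coincide. If that agreement is predictable above chance on
    # the matched data, outcomes leak the sensitive attribute there.
    z = (a == y_hat).astype(int)
    g = DecisionTreeClassifier(max_depth=3).fit(X, z, sample_weight=w)

    # Weighted excess accuracy over chance; a value above 0 flags a
    # potential multi-differential fairness violation (assumption).
    excess = np.average(g.predict(X) == z, weights=w) - 0.5
    return g, excess
```

Using a shallow decision tree as the auditor is a deliberate choice in this sketch: its leaves describe interpretable feature regions, which matches the tool's stated goal of identifying the characteristics of the group being discriminated against.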
Cite
Text
Gitiaux and Rangwala. "Mdfa: Multi-Differential Fairness Auditor for Black Box Classifiers." International Joint Conference on Artificial Intelligence, 2019. doi:10.24963/IJCAI.2019/814
Markdown
[Gitiaux and Rangwala. "Mdfa: Multi-Differential Fairness Auditor for Black Box Classifiers." International Joint Conference on Artificial Intelligence, 2019.](https://mlanthology.org/ijcai/2019/gitiaux2019ijcai-mdfa/) doi:10.24963/IJCAI.2019/814
BibTeX
@inproceedings{gitiaux2019ijcai-mdfa,
title = {{Mdfa: Multi-Differential Fairness Auditor for Black Box Classifiers}},
author = {Gitiaux, Xavier and Rangwala, Huzefa},
booktitle = {International Joint Conference on Artificial Intelligence},
year = {2019},
pages = {5871-5879},
doi = {10.24963/IJCAI.2019/814},
url = {https://mlanthology.org/ijcai/2019/gitiaux2019ijcai-mdfa/}
}