Unjustified Classification Regions and Counterfactual Explanations in Machine Learning
Abstract
Post-hoc interpretability approaches, although powerful tools for generating explanations of predictions made by a trained black-box model, have been shown to be vulnerable to issues caused by a lack of robustness of the classifier. In particular, this paper focuses on the notion of explanation justification, defined as connectedness to ground-truth data, in the context of counterfactuals. In this work, we explore the extent of the risk of generating unjustified explanations. We propose an empirical study to assess the vulnerability of classifiers and show that the chosen learning algorithm heavily impacts the vulnerability of the model. Additionally, we show that state-of-the-art post-hoc counterfactual approaches can minimize the impact of this risk by generating less local explanations. Source code is available at https://github.com/thibaultlaugel/truce.
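To make the notion of justification as connectedness more concrete, the following is a minimal sketch, not the authors' procedure (see the linked repository for that). It assumes connectedness can be approximated by an eps-chain: a counterfactual candidate counts as justified if it can be linked, through hops of length at most `eps` over training points that the classifier assigns to the candidate's predicted class, to at least one correctly classified training point. The names `is_justified`, `toy_predict`, and `eps` are hypothetical and chosen only for illustration.

```python
from collections import deque

import numpy as np


def is_justified(candidate, X_train, y_train, predict, eps):
    """Approximate justification check via an eps-chain (illustrative assumption).

    Returns True if `candidate` can reach a correctly classified training
    point of its own predicted class through hops of length <= eps over
    training points that the classifier assigns to that class.
    """
    candidate = np.asarray(candidate, dtype=float)
    X_train = np.asarray(X_train, dtype=float)
    y_train = np.asarray(y_train)

    # Class the black-box model assigns to the candidate explanation.
    target = predict(candidate[None, :])[0]

    # Only points predicted in the same class can be part of the chain.
    keep = predict(X_train) == target
    nodes = X_train[keep]
    # A node is ground truth if its true label matches the prediction.
    is_ground_truth = y_train[keep] == target

    visited = np.zeros(len(nodes), dtype=bool)
    queue = deque([candidate])
    while queue:
        current = queue.popleft()
        dists = np.linalg.norm(nodes - current, axis=1)
        for idx in np.flatnonzero((dists <= eps) & ~visited):
            if is_ground_truth[idx]:
                return True
            visited[idx] = True
            queue.append(nodes[idx])
    return False


def toy_predict(X):
    """Hypothetical stand-in classifier: class 1 when the first feature is positive."""
    return (np.asarray(X, dtype=float)[:, 0] > 0).astype(int)
```

Under these assumptions, a candidate lying close to correctly classified training data of its predicted class is reported as justified, while a candidate sitting in an isolated part of that class's region is not. Using only training points as chain nodes is a deliberate simplification; a denser set of sampled points would give a finer approximation of connectedness.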
Cite
Text
Laugel et al. "Unjustified Classification Regions and Counterfactual Explanations in Machine Learning." European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, 2019. doi:10.1007/978-3-030-46147-8_3
Markdown
[Laugel et al. "Unjustified Classification Regions and Counterfactual Explanations in Machine Learning." European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, 2019.](https://mlanthology.org/ecmlpkdd/2019/laugel2019ecmlpkdd-unjustified/) doi:10.1007/978-3-030-46147-8_3
BibTeX
@inproceedings{laugel2019ecmlpkdd-unjustified,
title = {{Unjustified Classification Regions and Counterfactual Explanations in Machine Learning}},
author = {Laugel, Thibault and Lesot, Marie-Jeanne and Marsala, Christophe and Renard, Xavier and Detyniecki, Marcin},
booktitle = {European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases},
year = {2019},
pages = {37-54},
doi = {10.1007/978-3-030-46147-8_3},
url = {https://mlanthology.org/ecmlpkdd/2019/laugel2019ecmlpkdd-unjustified/}
}