What Makes a Good Explanation?: A Harmonized View of Properties of Explanations
Abstract
Interpretability provides a means for humans to verify aspects of machine learning (ML) models. Different tasks require explanations with different properties. However, there is presently no standard for assessing the properties of explanations: different papers use the same term to mean different quantities, and different terms to mean the same quantity. This lack of standardization prevents us from rigorously comparing explanation systems. In this work, we survey the properties of explanations defined in the current interpretable ML literature, synthesize these properties based on what they measure, and describe the trade-offs between their different formulations. The result is a unifying framework for comparing properties of interpretable ML.
Cite
Text

Subhash et al. "What Makes a Good Explanation?: A Harmonized View of Properties of Explanations." NeurIPS 2022 Workshops: TEA, 2022.

BibTeX
@inproceedings{subhash2022neuripsw-makes,
title = {{What Makes a Good Explanation?: A Harmonized View of Properties of Explanations}},
author = {Subhash, Varshini and Chen, Zixi and Havasi, Marton and Pan, Weiwei and Doshi-Velez, Finale},
booktitle = {NeurIPS 2022 Workshops: TEA},
year = {2022},
url = {https://mlanthology.org/neuripsw/2022/subhash2022neuripsw-makes/}
}