The Disagreement Problem in Faithfulness Metrics

Abstract

The field of explainable artificial intelligence (XAI) aims to explain how black-box machine learning models work. Much of the work centers on the holy grail of providing post-hoc feature attributions for any model architecture. While the pace of innovation in novel methods has slowed, the question remains of how to choose a method and how to make it fit for purpose. Recent efforts to benchmark XAI methods have suggested metrics for that purpose, but there are many to choose from, and that bounty of choice still leaves an end user unclear on how to proceed. This paper compares metrics that aim to measure the faithfulness of local explanations on tabular classification problems and shows that the current metrics do not agree, leaving users unsure how to choose the most faithful explanations.

Cite

Text

Barr et al. "The Disagreement Problem in Faithfulness Metrics." NeurIPS 2023 Workshops: XAIA, 2023.

Markdown

[Barr et al. "The Disagreement Problem in Faithfulness Metrics." NeurIPS 2023 Workshops: XAIA, 2023.](https://mlanthology.org/neuripsw/2023/barr2023neuripsw-disagreement/)

BibTeX

@inproceedings{barr2023neuripsw-disagreement,
  title     = {{The Disagreement Problem in Faithfulness Metrics}},
  author    = {Barr, Brian and Fatsi, Noah and Hancox-Li, Leif and Richter, Peter and Proano, Daniel},
  booktitle = {NeurIPS 2023 Workshops: XAIA},
  year      = {2023},
  url       = {https://mlanthology.org/neuripsw/2023/barr2023neuripsw-disagreement/}
}