A Dual-Perspective Approach to Evaluating Feature Attribution Methods

Abstract

Feature attribution methods attempt to explain neural network predictions by identifying relevant features. However, establishing a cohesive framework for assessing feature attribution remains a challenge. Attributions can be evaluated through several lenses. One principal lens is to observe the effect of perturbing attributed features on the model’s behavior (i.e., faithfulness). While providing useful insights, existing faithfulness evaluations suffer from shortcomings that we reveal in this paper. To address these limitations, we propose two new perspectives within the faithfulness paradigm that capture intuitive properties: soundness and completeness. Soundness assesses the degree to which attributed features are truly predictive features, while completeness examines how well the resulting attribution reveals all the predictive features. The two perspectives rest on a firm mathematical foundation and provide quantitative metrics that are computable through efficient algorithms. We apply these metrics to mainstream attribution methods, offering a novel lens through which to analyze and compare them.
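
As a rough illustration of the perturbation-based faithfulness idea the abstract refers to, the sketch below masks the most strongly attributed pixels of an input and measures the drop in the model's predicted probability. It assumes a PyTorch image classifier and a per-pixel attribution map; it is a generic perturbation check, not the paper's soundness or completeness metric.

# Minimal sketch (not the paper's metrics): a generic perturbation-based
# faithfulness check. `model`, `x`, and `attribution` are assumed to be a
# PyTorch classifier, an input of shape (C, H, W), and an attribution map
# of shape (H, W), respectively.
import torch

def perturbation_score(model, x, attribution, top_k=100, baseline=0.0):
    """Mask the top-k attributed pixels and return the drop in the
    predicted class probability. A larger drop suggests the attributed
    features were more influential for the prediction."""
    model.eval()
    with torch.no_grad():
        probs = torch.softmax(model(x.unsqueeze(0)), dim=1)
        target = probs.argmax(dim=1)
        original = probs[0, target].item()

        # Select the k most strongly attributed pixels.
        flat = attribution.flatten()
        idx = flat.topk(top_k).indices
        mask = torch.ones_like(flat)
        mask[idx] = 0.0
        mask = mask.view_as(attribution)

        # Replace masked pixels with the baseline value and re-evaluate.
        x_perturbed = x * mask + baseline * (1.0 - mask)
        perturbed = torch.softmax(model(x_perturbed.unsqueeze(0)), dim=1)[0, target].item()

    return original - perturbed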

Cite

Text

Li et al. "A Dual-Perspective Approach to Evaluating Feature Attribution Methods." Transactions on Machine Learning Research, 2024.

Markdown

[Li et al. "A Dual-Perspective Approach to Evaluating Feature Attribution Methods." Transactions on Machine Learning Research, 2024.](https://mlanthology.org/tmlr/2024/li2024tmlr-dualperspective/)

BibTeX

@article{li2024tmlr-dualperspective,
  title     = {{A Dual-Perspective Approach to Evaluating Feature Attribution Methods}},
  author    = {Li, Yawei and Zhang, Yang and Kawaguchi, Kenji and Khakzar, Ashkan and Bischl, Bernd and Rezaei, Mina},
  journal   = {Transactions on Machine Learning Research},
  year      = {2024},
  url       = {https://mlanthology.org/tmlr/2024/li2024tmlr-dualperspective/}
}