An Empirical Evaluation of the Rashomon Effect in Explainable Machine Learning

Abstract

The Rashomon Effect describes the following phenomenon: for a given dataset there may exist many models with equally good performance but with different solution strategies. The Rashomon Effect has implications for Explainable Machine Learning, especially for the comparability of explanations. We provide a unified view of three different comparison scenarios and conduct a quantitative evaluation across different datasets, models, attribution methods, and metrics. We find that hyperparameter tuning plays a role and that metric selection matters. Our results provide empirical support for previously anecdotal evidence and highlight challenges for both scientists and practitioners.
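As a rough illustration of the phenomenon the abstract describes, the following minimal sketch builds two linear models that fit a toy dataset equally well yet produce entirely different feature attributions. Everything here is an assumption made for illustration and is not taken from the paper: the dataset (whose second feature is an exact copy of the first), the hand-picked weights, the weight-times-input attribution, and cosine similarity as the comparison metric.

```python
import math

# Hypothetical toy data: feature 1 is an exact copy of feature 0, target is 2 * x0.
X = [(1.0, 1.0), (2.0, 2.0), (3.0, 3.0), (-1.0, -1.0)]
y = [2.0, 4.0, 6.0, -2.0]


def predict(weights, x):
    """Prediction of a linear model without bias term."""
    return sum(w * xi for w, xi in zip(weights, x))


def mse(weights):
    """Mean squared error of the model on the toy data."""
    return sum((predict(weights, x) - t) ** 2 for x, t in zip(X, y)) / len(X)


def attribution(weights, x):
    """Weight-times-input attribution per feature (an illustrative choice)."""
    return [w * xi for w, xi in zip(weights, x)]


def cosine(a, b):
    """Cosine similarity between two attribution vectors."""
    dot = sum(p * q for p, q in zip(a, b))
    norm_a = math.sqrt(sum(p * p for p in a))
    norm_b = math.sqrt(sum(q * q for q in b))
    return dot / (norm_a * norm_b)


# Two models with identical predictions but different "solution strategies":
# model_a relies only on feature 0, model_b only on feature 1.
model_a = (2.0, 0.0)
model_b = (0.0, 2.0)

errors = (mse(model_a), mse(model_b))
similarities = [
    cosine(attribution(model_a, x), attribution(model_b, x)) for x in X
]

print("MSE per model:", errors)
print("Attribution similarity per sample:", similarities)
```

Both models reach the same error, while their attribution vectors are orthogonal on every sample. A different similarity metric could rate the same pair of explanations differently, which is one reason metric selection matters when comparing explanations.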

Cite

Text

Müller et al. "An Empirical Evaluation of the Rashomon Effect in Explainable Machine Learning." European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, 2023. doi:10.1007/978-3-031-43418-1_28

Markdown

[Müller et al. "An Empirical Evaluation of the Rashomon Effect in Explainable Machine Learning." European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, 2023.](https://mlanthology.org/ecmlpkdd/2023/muller2023ecmlpkdd-empirical/) doi:10.1007/978-3-031-43418-1_28

BibTeX

@inproceedings{muller2023ecmlpkdd-empirical,
  title     = {{An Empirical Evaluation of the Rashomon Effect in Explainable Machine Learning}},
  author    = {Müller, Sebastian and Toborek, Vanessa and Beckh, Katharina and Jakobs, Matthias and Bauckhage, Christian and Welke, Pascal},
  booktitle = {European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases},
  year      = {2023},
  pages     = {462--478},
  doi       = {10.1007/978-3-031-43418-1_28},
  url       = {https://mlanthology.org/ecmlpkdd/2023/muller2023ecmlpkdd-empirical/}
}