Improving Perturbation-Based Explanations by Understanding the Role of Uncertainty Calibration

Abstract

Perturbation-based explanations are widely used in practice to enhance the transparency of machine-learning models. However, their reliability is often compromised because model behavior under the specific perturbations involved is unknown. This paper investigates the relationship between uncertainty calibration (the alignment of model confidence with actual accuracy) and perturbation-based explanations. We show that models systematically produce unreliable probability estimates when subjected to explainability-specific perturbations, and we theoretically prove that this directly undermines both global and local explanation quality. To address this, we introduce ReCalX, a novel approach that recalibrates models for improved explanations while preserving their original predictions. Empirical evaluations across diverse models and datasets demonstrate that ReCalX yields the largest reductions in perturbation-specific miscalibration while enhancing explanation robustness and the identification of globally important input features.
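To make the two quantities the abstract relates concrete, the sketch below is a minimal NumPy illustration, not the authors' ReCalX implementation: it uses a toy linear classifier (all names, such as `occlusion_attribution` and the zero-baseline occlusion perturbation, are illustrative assumptions) to compute an occlusion-style perturbation attribution and to compare expected calibration error (ECE) on clean versus perturbed inputs, where a gap of the kind the paper studies can appear.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

# Toy classifier: a fixed random linear model over 10 features, 3 classes.
W = rng.normal(size=(10, 3))

def predict_proba(X):
    return softmax(X @ W)

def occlusion_attribution(x, baseline=0.0):
    """Perturbation-based attribution: score each feature by the drop in the
    predicted class probability when that feature is replaced by a baseline."""
    p = predict_proba(x[None])[0]
    c = p.argmax()
    scores = np.empty_like(x)
    for i in range(x.size):
        x_pert = x.copy()
        x_pert[i] = baseline  # the explainability-specific perturbation
        scores[i] = p[c] - predict_proba(x_pert[None])[0, c]
    return scores

def expected_calibration_error(probs, labels, n_bins=10):
    """Standard ECE: |confidence - accuracy| averaged over confidence bins."""
    conf = probs.max(axis=1)
    correct = (probs.argmax(axis=1) == labels).astype(float)
    bins = np.minimum((conf * n_bins).astype(int), n_bins - 1)
    ece = 0.0
    for b in range(n_bins):
        mask = bins == b
        if mask.any():
            ece += mask.mean() * abs(conf[mask].mean() - correct[mask].mean())
    return ece

# Synthetic data labeled by the model itself, with 20% of labels flipped,
# so calibration on clean vs. perturbed inputs can be compared meaningfully.
X = rng.normal(size=(2000, 10))
y = predict_proba(X).argmax(axis=1)
flip = rng.random(len(y)) < 0.2
y[flip] = rng.integers(0, 3, flip.sum())

# Apply the same kind of perturbation an occlusion explainer would use.
X_pert = X.copy()
X_pert[rng.random(X.shape) < 0.5] = 0.0  # occlude half the features

print("ECE on clean inputs:    ", expected_calibration_error(predict_proba(X), y))
print("ECE on perturbed inputs:", expected_calibration_error(predict_proba(X_pert), y))
print("Occlusion attribution for one input:", occlusion_attribution(X[0]).round(3))
```

If the model's confidence is only calibrated on the clean data distribution, the second ECE value can be markedly worse, which is the failure mode the paper argues corrupts perturbation-based attribution scores like the one computed above.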

Cite

Text

Decker et al. "Improving Perturbation-Based Explanations by Understanding the Role of Uncertainty Calibration." Advances in Neural Information Processing Systems, 2025.

Markdown

[Decker et al. "Improving Perturbation-Based Explanations by Understanding the Role of Uncertainty Calibration." Advances in Neural Information Processing Systems, 2025.](https://mlanthology.org/neurips/2025/decker2025neurips-improving/)

BibTeX

@inproceedings{decker2025neurips-improving,
  title     = {{Improving Perturbation-Based Explanations by Understanding the Role of Uncertainty Calibration}},
  author    = {Decker, Thomas and Tresp, Volker and Buettner, Florian},
  booktitle = {Advances in Neural Information Processing Systems},
  year      = {2025},
  url       = {https://mlanthology.org/neurips/2025/decker2025neurips-improving/}
}