Efficient XAI: A Low-Cost Data Reduction Approach to SHAP Interpretability
Abstract
Explainable Artificial Intelligence (XAI) has become a critical area of research, particularly in ensuring transparency and trustworthiness in machine learning (ML) models. In this context, SHAP (SHapley Additive exPlanations) is widely recognized as a robust method for feature attribution, yet its computational cost poses significant challenges, especially for large datasets. This study explores a novel approach to optimizing SHAP computations by leveraging Slovin’s formula, a statistical sampling technique traditionally used in survey research. Unlike feature selection or dimensionality reduction methods, Slovin’s formula requires minimal prior knowledge of the dataset’s statistical properties while providing an efficient, heuristic-based alternative for data reduction. It offers a straightforward, low-cost sampling approach that can be applied without extensive preprocessing, making it accessible for computationally constrained environments. Through controlled experiments on synthetic datasets, we analyze the stability of SHAP values under Slovin-based subsampling across varying data characteristics, including feature and target types, their distributions, and dataset sizes. Our findings reveal a U-shaped trade-off: SHAP values for mid-ranked features remain stable, whereas extreme values exhibit higher fluctuations. Additionally, categorical features and features with non-skewed distributions maintain greater robustness, while highly skewed target distributions introduce variability. Importantly, the effectiveness of Slovin’s formula diminishes when the subsample-to-sample ratio falls below 5%. By integrating Slovin’s formula into SHAP workflows, we demonstrate a practical solution for balancing interpretability and computational efficiency in machine learning. This method reduces processing costs while retaining key feature attributions, making it particularly valuable for researchers and practitioners working in resource-constrained environments.
Our study contributes to the broader discourse on sustainable AI, offering a scalable and interpretable framework for advancing explainability in modern machine learning systems.
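As a rough illustration of the sampling step the abstract describes, the sketch below applies the standard form of Slovin’s formula, n = N / (1 + N·e²), to choose a subsample size and then draws that many row indices uniformly at random. The function names, the default margin of error of 0.05, and the fixed seed are assumptions made for this example; the abstract does not state which settings the study used, and the SHAP computation itself is left out.

```python
import math
import random


def slovin_sample_size(population_size: int, margin_of_error: float = 0.05) -> int:
    """Subsample size from Slovin's formula, n = N / (1 + N * e^2), rounded up."""
    if population_size <= 0:
        raise ValueError("population_size must be positive")
    if not 0 < margin_of_error < 1:
        raise ValueError("margin_of_error must lie strictly between 0 and 1")
    return math.ceil(population_size / (1 + population_size * margin_of_error ** 2))


def slovin_subsample_indices(population_size: int,
                             margin_of_error: float = 0.05,
                             seed: int = 0) -> list[int]:
    """Draw row indices without replacement; the seed is a hypothetical default."""
    n = slovin_sample_size(population_size, margin_of_error)
    rng = random.Random(seed)
    return sorted(rng.sample(range(population_size), n))


if __name__ == "__main__":
    N = 10_000
    n = slovin_sample_size(N)            # 385 for N = 10,000 and e = 0.05
    indices = slovin_subsample_indices(N)
    print(n, len(indices), f"ratio = {n / N:.4f}")
```

The selected indices would then be used to pick the rows passed to a SHAP explainer. Printing the ratio n / N makes it easy to check a given configuration against the 5% threshold mentioned in the abstract; with these assumed settings, 10,000 rows give a ratio of about 3.85%.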
Cite
Text
Bachmann. "Efficient XAI: A Low-Cost Data Reduction Approach to SHAP Interpretability." Journal of Artificial Intelligence Research, 2025. doi:10.1613/JAIR.1.18325
Markdown
[Bachmann. "Efficient XAI: A Low-Cost Data Reduction Approach to SHAP Interpretability." Journal of Artificial Intelligence Research, 2025.](https://mlanthology.org/jair/2025/bachmann2025jair-efficient/) doi:10.1613/JAIR.1.18325
BibTeX
@article{bachmann2025jair-efficient,
title = {{Efficient XAI: A Low-Cost Data Reduction Approach to SHAP Interpretability}},
author = {Bachmann, Severin},
journal = {Journal of Artificial Intelligence Research},
year = {2025},
doi = {10.1613/JAIR.1.18325},
volume = {83},
url = {https://mlanthology.org/jair/2025/bachmann2025jair-efficient/}
}