ExpProof : Operationalizing Explanations for Confidential Models with ZKPs
Abstract
In principle, explanations are intended as a way to increase trust in machine learning models and are often mandated by regulations. However, many circumstances where these are demanded are adversarial in nature, meaning the involved parties have misaligned interests and are incentivized to manipulate explanations for their own purposes. As a result, explainability methods fail to be operational in such settings despite the demand (Bordt et al., 2022). In this paper, we take a step towards operationalizing explanations in adversarial scenarios with Zero-Knowledge Proofs (ZKPs), a cryptographic primitive. Specifically, we explore ZKP-amenable versions of the popular explainability algorithm LIME and evaluate their performance on Neural Networks and Random Forests. Our code is publicly available at: https://github.com/infinite-pursuits/ExpProof.
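For context, LIME explains a single prediction by perturbing the input, querying the black-box model on the perturbed points, and fitting a proximity-weighted linear surrogate whose coefficients act as the local explanation. The sketch below shows this vanilla procedure only, not the paper's ZKP-amenable variant; all function names, the Gaussian perturbation scale, and the exponential kernel choice are illustrative assumptions.

import numpy as np
from sklearn.linear_model import Ridge

def lime_explain(model_predict, x, num_samples=1000, kernel_width=0.75):
    """Fit a local linear surrogate to model_predict around input x
    and return its coefficients as feature attributions."""
    d = x.shape[0]
    # Sample a local neighborhood around x (illustrative Gaussian noise).
    Z = x + np.random.normal(scale=0.3, size=(num_samples, d))
    # Query the (possibly confidential) model on the perturbed points.
    y = model_predict(Z)
    # Weight samples by proximity to x with an exponential kernel.
    dist = np.linalg.norm(Z - x, axis=1)
    w = np.exp(-(dist ** 2) / (kernel_width ** 2))
    # Fit the weighted linear surrogate; its coefficients are the explanation.
    surrogate = Ridge(alpha=1.0)
    surrogate.fit(Z, y, sample_weight=w)
    return surrogate.coef_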
Cite
Text
Yadav et al. "ExpProof : Operationalizing Explanations for Confidential Models with ZKPs." ICLR 2025 Workshops: BuildingTrust, 2025.
Markdown
[Yadav et al. "ExpProof : Operationalizing Explanations for Confidential Models with ZKPs." ICLR 2025 Workshops: BuildingTrust, 2025.](https://mlanthology.org/iclrw/2025/yadav2025iclrw-expproof/)
BibTeX
@inproceedings{yadav2025iclrw-expproof,
title = {{ExpProof : Operationalizing Explanations for Confidential Models with ZKPs}},
author = {Yadav, Chhavi and Laufer, Evan and Boneh, Dan and Chaudhuri, Kamalika},
booktitle = {ICLR 2025 Workshops: BuildingTrust},
year = {2025},
url = {https://mlanthology.org/iclrw/2025/yadav2025iclrw-expproof/}
}