I-Trustworthy Models. A Framework for Trustworthiness Evaluation of Probabilistic Classifiers

Abstract

As probabilistic models continue to permeate various facets of our society and contribute to scientific advancements, it becomes necessary to go beyond traditional metrics such as predictive accuracy and error rates and to assess their trustworthiness. Grounded in the competence-based theory of trust, this work formalizes the I-trustworthy framework, a novel framework for assessing the trustworthiness of probabilistic classifiers for inference tasks by linking conditional calibration to trustworthiness. To assess I-trustworthiness, we use the local calibration error (LCE) and develop a hypothesis-testing method. This method uses a kernel-based test statistic, the Kernel Local Calibration Error (KLCE), to test the local calibration of a probabilistic classifier. We provide theoretical guarantees in the form of convergence bounds for an unbiased estimator of KLCE. Additionally, we present a diagnostic tool designed to identify and measure biases in cases of miscalibration. The effectiveness of the proposed test statistic is demonstrated through its application to both simulated and real-world datasets. Finally, we study the LCE of related recalibration methods and provide evidence that existing methods are insufficient to achieve I-trustworthiness.

Cite

Text

Vashistha and Farahi. "I-Trustworthy Models. A Framework for Trustworthiness Evaluation of Probabilistic Classifiers." Proceedings of The 28th International Conference on Artificial Intelligence and Statistics, 2025.

Markdown

[Vashistha and Farahi. "I-Trustworthy Models. A Framework for Trustworthiness Evaluation of Probabilistic Classifiers." Proceedings of The 28th International Conference on Artificial Intelligence and Statistics, 2025.](https://mlanthology.org/aistats/2025/vashistha2025aistats-itrustworthy/)

BibTeX

@inproceedings{vashistha2025aistats-itrustworthy,
  title     = {{I-Trustworthy Models. A Framework for Trustworthiness Evaluation of Probabilistic Classifiers}},
  author    = {Vashistha, Ritwik and Farahi, Arya},
  booktitle = {Proceedings of The 28th International Conference on Artificial Intelligence and Statistics},
  year      = {2025},
  pages     = {4726--4734},
  volume    = {258},
  url       = {https://mlanthology.org/aistats/2025/vashistha2025aistats-itrustworthy/}
}