Testing Whether a Learning Procedure Is Calibrated

Abstract

A learning procedure takes as input a dataset and performs inference for the parameters $\theta$ of a model that is assumed to have given rise to the dataset. Here we consider learning procedures whose output is a probability distribution, representing uncertainty about $\theta$ after seeing the dataset. Bayesian inference is a prime example of such a procedure, but one can also construct other learning procedures that return distributional output. This paper studies conditions for a learning procedure to be considered calibrated, in the sense that the true data-generating parameters are plausible as samples from its distributional output. A learning procedure whose inferences and predictions are systematically over- or under-confident will fail to be calibrated. On the other hand, a learning procedure that is calibrated need not be statistically efficient. A hypothesis-testing framework is developed in order to assess, using simulation, whether a learning procedure is calibrated. Several vignettes are presented to illustrate different aspects of the framework.
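As a rough illustration of what a simulation-based assessment of calibration might look like, the sketch below runs a generic coverage-style check on an exactly conjugate Gaussian learning procedure: the true parameter is drawn from the prior, a dataset is simulated, the posterior is computed, and the posterior CDF is evaluated at the truth. Under calibration these values should be uniform, which is assessed with a Kolmogorov-Smirnov test. This is a minimal sketch of the general idea, not the specific test statistic developed in the paper; the model, settings, and helper function are illustrative assumptions.

import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n_trials, n_obs, sigma = 500, 20, 1.0  # illustrative settings

def fit_posterior(y, sigma=1.0, prior_var=1.0):
    """Exact conjugate posterior for theta under y_i ~ N(theta, sigma^2), theta ~ N(0, prior_var).

    Illustrative helper, not part of the paper's framework.
    """
    post_var = 1.0 / (1.0 / prior_var + len(y) / sigma**2)
    post_mean = post_var * y.sum() / sigma**2
    return post_mean, np.sqrt(post_var)

u = np.empty(n_trials)
for t in range(n_trials):
    theta_true = rng.normal(0.0, 1.0)              # draw the "true" parameter from the prior
    y = rng.normal(theta_true, sigma, size=n_obs)  # simulate a dataset from the assumed model
    mean, sd = fit_posterior(y, sigma)             # run the learning procedure
    u[t] = stats.norm.cdf(theta_true, loc=mean, scale=sd)  # posterior CDF at the truth

# If the procedure is calibrated, u should look Uniform(0, 1);
# a small p-value is evidence against calibration.
print(stats.kstest(u, "uniform"))

An over-confident procedure (posterior too narrow) would push these values toward 0 and 1, while an under-confident one would concentrate them near 0.5; either pattern is detected by the uniformity test.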

Cite

Text

Cockayne et al. "Testing Whether a Learning Procedure Is Calibrated." Journal of Machine Learning Research, 2022.

Markdown

[Cockayne et al. "Testing Whether a Learning Procedure Is Calibrated." Journal of Machine Learning Research, 2022.](https://mlanthology.org/jmlr/2022/cockayne2022jmlr-testing/)

BibTeX

@article{cockayne2022jmlr-testing,
  title     = {{Testing Whether a Learning Procedure Is Calibrated}},
  author    = {Cockayne, Jon and Graham, Matthew M. and Oates, Chris J. and Sullivan, T. J. and Teymur, Onur},
  journal   = {Journal of Machine Learning Research},
  year      = {2022},
  pages     = {1--36},
  volume    = {23},
  url       = {https://mlanthology.org/jmlr/2022/cockayne2022jmlr-testing/}
}