Measuring Calibration in Deep Learning
Abstract
The reliability of a machine learning model's confidence in its predictions is critical for high-risk applications. Calibration--the idea that a model's predicted probabilities of outcomes reflect true probabilities of those outcomes--formalizes this notion. Current calibration metrics fail to consider all of the predictions made by machine learning models, and are inefficient in their estimation of the calibration error. We design the Adaptive Calibration Error (ACE) metric to resolve these pathologies and show that it outperforms other metrics, especially in settings where predictions beyond the maximum prediction that is chosen as the output class matter.
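The abstract describes ACE at a high level: rather than the fixed-width confidence bins of standard expected calibration error, it uses adaptive (equal-mass) ranges, and it scores all class probabilities rather than only the argmax prediction. A minimal sketch of that idea might look like the following; the function name and the exact averaging are illustrative assumptions, not the paper's reference implementation.

```python
import numpy as np

def adaptive_calibration_error(probs, labels, num_ranges=15):
    """Sketch of an adaptive-binning calibration error.

    probs: (N, K) array of predicted class probabilities.
    labels: (N,) array of true class indices.

    Unlike fixed-width ECE bins, each range holds an (approximately)
    equal number of predictions, and every class's probabilities are
    scored, not just the maximum prediction.
    """
    n, k = probs.shape
    total = 0.0
    for c in range(k):
        conf = probs[:, c]
        correct = (labels == c).astype(float)
        # Sort by confidence, then split into equal-mass ranges.
        order = np.argsort(conf)
        for idx in np.array_split(order, num_ranges):
            # Per-range gap between accuracy and mean confidence.
            total += abs(correct[idx].mean() - conf[idx].mean())
    return total / (k * num_ranges)
```

A perfectly calibrated, deterministic classifier (one-hot probabilities that match the labels) yields an error of zero under this sketch, while a classifier that always outputs 0.5 on a one-class binary dataset yields 0.5.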
Cite
Text
Nixon et al. "Measuring Calibration in Deep Learning." IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, 2019.
Markdown
[Nixon et al. "Measuring Calibration in Deep Learning." IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, 2019.](https://mlanthology.org/cvprw/2019/nixon2019cvprw-measuring/)
BibTeX
@inproceedings{nixon2019cvprw-measuring,
title = {{Measuring Calibration in Deep Learning}},
author = {Nixon, Jeremy and Dusenberry, Michael W. and Zhang, Linchuan and Jerfel, Ghassen and Tran, Dustin},
booktitle = {IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops},
year = {2019},
pages = {38--41},
url = {https://mlanthology.org/cvprw/2019/nixon2019cvprw-measuring/}
}