Learning Prediction Intervals for Model Performance

Abstract

Understanding model performance on unlabeled data is a fundamental challenge in developing, deploying, and maintaining AI systems. Model performance is typically evaluated using test sets or periodic manual quality assessments, both of which require laborious manual data labeling. Automated performance prediction techniques aim to mitigate this burden, but potential inaccuracy and a lack of trust in their predictions have prevented their widespread adoption. We address this core problem of performance prediction uncertainty with a method to compute prediction intervals for model performance. Our methodology uses transfer learning to train an uncertainty model that estimates the uncertainty of model performance predictions. We evaluate our approach across a wide range of drift conditions and show substantial improvement over competitive baselines. We believe this result makes prediction intervals, and performance prediction in general, significantly more practical for real-world use.
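The general idea behind the abstract (a performance predictor for unlabeled data, paired with a second model that estimates how far that prediction is likely to be off, yielding a prediction interval) can be illustrated with a minimal sketch. The code below is a hypothetical illustration, not the authors' implementation: the features, models, and interval scaling are placeholder choices made only to show the two-model structure.

```python
# Hypothetical sketch: predict a classifier's accuracy on unlabeled batches,
# then attach a rough prediction interval via a learned uncertainty model.
# This is NOT the paper's method; all names and features are illustrative.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X, y = make_classification(n_samples=20000, n_features=20, n_informative=10,
                           random_state=0)
X_base, X_meta, y_base, y_meta = train_test_split(X, y, test_size=0.5,
                                                  random_state=0)

# 1) Base model whose performance we want to predict without labels.
base = LogisticRegression(max_iter=1000).fit(X_base, y_base)

def batch_features_and_accuracy(Xb, yb):
    """Confidence-based features for a batch, plus its true accuracy."""
    proba = base.predict_proba(Xb)
    conf = proba.max(axis=1)
    feats = [conf.mean(), conf.std(), (conf < 0.6).mean()]
    acc = (base.predict(Xb) == yb).mean()
    return feats, acc

# 2) Meta-dataset of (batch features, batch accuracy) pairs.
feats, accs = [], []
for _ in range(300):
    idx = rng.choice(len(X_meta), size=200, replace=False)
    f, a = batch_features_and_accuracy(X_meta[idx], y_meta[idx])
    feats.append(f)
    accs.append(a)
feats, accs = np.array(feats), np.array(accs)

# 3) Performance predictor: batch features -> predicted accuracy.
perf_model = GradientBoostingRegressor(random_state=0).fit(feats, accs)

# 4) Uncertainty model: batch features -> expected absolute error of the
#    performance predictor (in-sample residuals here, purely for brevity).
residuals = np.abs(accs - perf_model.predict(feats))
unc_model = GradientBoostingRegressor(random_state=0).fit(feats, residuals)

# 5) Prediction interval for a new batch (treated as unlabeled at test time).
idx = rng.choice(len(X_meta), size=200, replace=False)
f_new, true_acc = batch_features_and_accuracy(X_meta[idx], y_meta[idx])
pred = perf_model.predict([f_new])[0]
half_width = 2.0 * unc_model.predict([f_new])[0]  # scale factor is a free choice
print(f"predicted accuracy {pred:.3f} +/- {half_width:.3f} (true {true_acc:.3f})")
```

In this toy setup the interval width adapts to the batch: batches whose confidence profile resembles the training distribution get narrow intervals, while unusual batches get wider ones, which is the behavior prediction intervals are meant to provide.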

Cite

Text

Elder et al. "Learning Prediction Intervals for Model Performance." AAAI Conference on Artificial Intelligence, 2021. doi:10.1609/AAAI.V35I8.16897

Markdown

[Elder et al. "Learning Prediction Intervals for Model Performance." AAAI Conference on Artificial Intelligence, 2021.](https://mlanthology.org/aaai/2021/elder2021aaai-learning/) doi:10.1609/AAAI.V35I8.16897

BibTeX

@inproceedings{elder2021aaai-learning,
  title     = {{Learning Prediction Intervals for Model Performance}},
  author    = {Elder, Benjamin and Arnold, Matthew and Murthi, Anupama and Navrátil, Jirí},
  booktitle = {AAAI Conference on Artificial Intelligence},
  year      = {2021},
  pages     = {7305--7313},
  doi       = {10.1609/AAAI.V35I8.16897},
  url       = {https://mlanthology.org/aaai/2021/elder2021aaai-learning/}
}