ODP-Bench: Benchmarking Out-of-Distribution Performance Prediction

Abstract

Out-of-Distribution (OOD) performance prediction has recently attracted growing attention. Its goal is to predict the performance of trained models on unlabeled OOD test datasets, so that off-the-shelf trained models can be better leveraged and deployed in risk-sensitive scenarios. Although progress has been made in this area, evaluation protocols in prior literature are inconsistent, and most works cover only a limited number of real-world OOD datasets and types of distribution shift. To enable convenient and fair comparison of different algorithms, we propose the Out-of-Distribution Performance Prediction Benchmark (ODP-Bench), a comprehensive benchmark that includes the most commonly used OOD datasets and existing practical performance prediction algorithms. We release our trained models as a testbench for future researchers, guaranteeing consistency of comparison and sparing them the burden of repeating model training. Furthermore, we conduct in-depth experimental analyses to better understand the capability boundaries of these algorithms.
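The task interface itself is simple: given a trained model and an unlabeled OOD test set, output an estimate of the model's accuracy on that set. As one concrete illustration (not the benchmark's own code), below is a minimal PyTorch sketch of the classic Average Confidence baseline, which predicts accuracy as the mean maximum softmax probability over the unlabeled data; the function name and loader handling are hypothetical.

import torch
import torch.nn.functional as F

@torch.no_grad()
def predict_accuracy_avg_conf(model, unlabeled_loader, device="cpu"):
    """Estimate accuracy on an unlabeled OOD set via Average Confidence:
    predicted accuracy = mean max-softmax probability over the set."""
    model.eval().to(device)
    total_conf, n = 0.0, 0
    for batch in unlabeled_loader:
        # The loader is unlabeled: a batch may be a bare tensor or an
        # (inputs, ...) tuple depending on the dataset wrapper.
        x = batch[0] if isinstance(batch, (tuple, list)) else batch
        probs = F.softmax(model(x.to(device)), dim=-1)
        total_conf += probs.max(dim=-1).values.sum().item()
        n += x.size(0)
    return total_conf / n  # in [0, 1]; compared against true OOD accuracy

A benchmark like ODP-Bench then scores such a predictor by how closely its estimate tracks the model's actual accuracy across many OOD datasets and shift types.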

Cite

Text

Yu et al. "ODP-Bench: Benchmarking Out-of-Distribution Performance Prediction." International Conference on Computer Vision, 2025.

Markdown

[Yu et al. "ODP-Bench: Benchmarking Out-of-Distribution Performance Prediction." International Conference on Computer Vision, 2025.](https://mlanthology.org/iccv/2025/yu2025iccv-odpbench/)

BibTeX

@inproceedings{yu2025iccv-odpbench,
  title     = {{ODP-Bench: Benchmarking Out-of-Distribution Performance Prediction}},
  author    = {Yu, Han and Li, Kehan and Li, Dongbai and He, Yue and Zhang, Xingxuan and Cui, Peng},
  booktitle = {International Conference on Computer Vision},
  year      = {2025},
  pages     = {1846--1858},
  url       = {https://mlanthology.org/iccv/2025/yu2025iccv-odpbench/}
}