Scoring Your Prediction on Unseen Data

Abstract

The performance of deep neural networks can vary substantially when they are evaluated on datasets that differ from the training data. This makes evaluating models on unseen data without access to labels a crucial challenge. Previous methods compute a single model-based indicator at the dataset level and use regression methods to predict performance. To evaluate the model more accurately, we propose a sample-level, label-free model evaluation method for better prediction on unseen data, named Scoring Your Prediction (SYP). Specifically, SYP introduces low-level image-based features (e.g., blurriness) to model image quality, which is important for classification. We combine the complementary model-based and image-based indicators to enhance the sample representation. Additionally, we predict the probability that each sample is correctly classified using a neural network that we call the oracle model. The proposed method outperforms existing methods on 40 unlabeled datasets obtained by transforming CIFAR-10. In particular, SYP lowers RMSE by 1.83-3.97 for ResNet-56 evaluation and by 2.32-9.74 for RepVGG-A0 evaluation compared with the latest methods. Note that our scheme won the championship in the DataCV Challenge at CVPR 2023. Source code is available at https://github.com/megvii-research/SYP.
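
The sample-level idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the specific indicators (softmax confidence, entropy, and a Laplacian-variance blurriness proxy), the function names, and the use of percentage accuracy are all assumptions made for illustration, and the oracle model itself is omitted. The sketch only shows how per-sample features might be assembled, and how per-sample correctness probabilities (which the oracle model would supply) could be averaged into a dataset-level accuracy estimate and scored with RMSE.

```python
import numpy as np

def laplacian_variance(gray):
    """Assumed blurriness proxy: variance of a 4-neighbour Laplacian.
    Lower values suggest a blurrier image."""
    g = np.asarray(gray, dtype=np.float64)
    lap = (g[:-2, 1:-1] + g[2:, 1:-1] + g[1:-1, :-2] + g[1:-1, 2:]
           - 4.0 * g[1:-1, 1:-1])
    return float(lap.var())

def sample_features(probs, gray):
    """Concatenate model-based and image-based indicators for one sample.
    The choice of indicators here is hypothetical."""
    p = np.clip(np.asarray(probs, dtype=np.float64), 1e-12, 1.0)
    confidence = float(p.max())              # model-based indicator
    entropy = float(-(p * np.log(p)).sum())  # model-based indicator
    blur = laplacian_variance(gray)          # image-based indicator
    return np.array([confidence, entropy, blur])

def estimate_accuracy(correct_probs):
    """Dataset-level accuracy estimate (in percent, by assumption):
    the mean of per-sample correctness probabilities."""
    return 100.0 * float(np.mean(correct_probs))

def rmse(predicted, actual):
    """Root-mean-square error between predicted and true accuracies."""
    predicted = np.asarray(predicted, dtype=np.float64)
    actual = np.asarray(actual, dtype=np.float64)
    return float(np.sqrt(np.mean((predicted - actual) ** 2)))
```

In this sketch, a trained classifier would map each `sample_features` vector to a correctness probability; `estimate_accuracy` then aggregates those probabilities for one unlabeled dataset, and `rmse` compares the estimates against true accuracies across several datasets.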

Cite

Text

Chen et al. "Scoring Your Prediction on Unseen Data." IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, 2023. doi:10.1109/CVPRW59228.2023.00330

Markdown

[Chen et al. "Scoring Your Prediction on Unseen Data." IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, 2023.](https://mlanthology.org/cvprw/2023/chen2023cvprw-scoring/) doi:10.1109/CVPRW59228.2023.00330

BibTeX

@inproceedings{chen2023cvprw-scoring,
  title     = {{Scoring Your Prediction on Unseen Data}},
  author    = {Chen, Yuhao and Zhang, Shen and Song, Renjie},
  booktitle = {IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops},
  year      = {2023},
  pages     = {3279-3288},
  doi       = {10.1109/CVPRW59228.2023.00330},
  url       = {https://mlanthology.org/cvprw/2023/chen2023cvprw-scoring/}
}