Predicting the Performance of Foundation Models via Agreement-on-the-Line

Abstract

Estimating out-of-distribution (OOD) performance in regimes where labels are scarce is critical to safely deploying foundation models. Recently, it was shown that ensembles of neural networks exhibit the phenomenon of "agreement-on-the-line", which can be leveraged to reliably predict OOD performance without labels. However, in contrast to classical neural networks that are trained from scratch on in-distribution data for numerous epochs, foundation models undergo minimal finetuning from heavily pretrained weights, which may reduce the ensemble diversity needed to observe agreement-on-the-line. First, we demonstrate that when lightly finetuning multiple runs from a $\textit{single}$ foundation model, the choice of randomness during training (linear head initialization, data ordering, and data subsetting) can lead to drastically different levels of agreement-on-the-line in the resulting ensemble. Surprisingly, only random head initialization reliably induces agreement-on-the-line in finetuned foundation models across vision and language benchmarks. Second, we demonstrate that ensembles of $\textit{multiple}$ foundation models pretrained on different datasets but finetuned on the same task can also show agreement-on-the-line. In total, by careful construction of a diverse ensemble, we can use agreement-on-the-line-based methods to predict the OOD performance of foundation models with high precision.
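To make the idea of predicting OOD accuracy from agreement concrete, the following is a minimal sketch and not the authors' implementation. It assumes the common formulation from prior agreement-on-the-line work: pairwise agreement between ensemble members is computed on in-distribution (ID) and OOD inputs, a line is fitted to the probit-transformed agreements, and the same slope and bias are then applied to each model's labeled ID accuracy. All function names, the clipping constant `eps`, and the list-of-predictions input format are hypothetical choices made for illustration.

```python
from itertools import combinations
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal, used for the probit transform


def probit(p, eps=1e-6):
    """Inverse standard-normal CDF, with p clipped away from 0 and 1."""
    p = min(max(p, eps), 1.0 - eps)
    return _STD_NORMAL.inv_cdf(p)


def agreement(preds_a, preds_b):
    """Fraction of positions where two prediction lists match.

    Passing the labels as preds_b gives ordinary accuracy.
    """
    return sum(a == b for a, b in zip(preds_a, preds_b)) / len(preds_a)


def fit_line(xs, ys):
    """Ordinary least-squares slope and intercept for y = slope * x + bias."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx  # assumes the ID agreements are not all identical
    return slope, mean_y - slope * mean_x


def predict_ood_accuracy(id_preds, ood_preds, id_labels):
    """Estimate each model's OOD accuracy without OOD labels.

    id_preds / ood_preds: one list of predicted classes per model.
    id_labels: ground-truth labels for the ID inputs only.
    """
    pairs = list(combinations(range(len(id_preds)), 2))
    # Agreement line: unlabeled, uses only pairs of model predictions.
    xs = [probit(agreement(id_preds[i], id_preds[j])) for i, j in pairs]
    ys = [probit(agreement(ood_preds[i], ood_preds[j])) for i, j in pairs]
    slope, bias = fit_line(xs, ys)
    # Assumption: the accuracy line shares this slope and bias.
    estimates = []
    for preds in id_preds:
        id_acc = agreement(preds, id_labels)
        estimates.append(_STD_NORMAL.cdf(slope * probit(id_acc) + bias))
    return slope, bias, estimates
```

The sketch needs at least three models so that the pairwise agreements span more than one point, and the estimates are only as trustworthy as the linear fit. This is why the diversity of the ensemble, the focus of the abstract, matters in practice.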

Cite

Text

Saxena et al. "Predicting the Performance of Foundation Models via Agreement-on-the-Line." Neural Information Processing Systems, 2024. doi:10.52202/079017-1002

Markdown

[Saxena et al. "Predicting the Performance of Foundation Models via Agreement-on-the-Line." Neural Information Processing Systems, 2024.](https://mlanthology.org/neurips/2024/saxena2024neurips-predicting/) doi:10.52202/079017-1002

BibTeX

@inproceedings{saxena2024neurips-predicting,
  title     = {{Predicting the Performance of Foundation Models via Agreement-on-the-Line}},
  author    = {Saxena, Rahul and Kim, Taeyoun and Mehra, Aman and Baek, Christina and Kolter, Zico and Raghunathan, Aditi},
  booktitle = {Neural Information Processing Systems},
  year      = {2024},
  doi       = {10.52202/079017-1002},
  url       = {https://mlanthology.org/neurips/2024/saxena2024neurips-predicting/}
}