A Generalization Theory for Zero-Shot Prediction

Abstract

A modern paradigm for generalization in machine learning and AI consists of pre-training a task-agnostic foundation model, generally obtained using self-supervised and multimodal contrastive learning. The resulting representations can then be used for prediction on a downstream task for which no labeled data is available. We present a theoretical framework for better understanding this approach, known as zero-shot prediction. We identify the target quantities that zero-shot prediction aims to learn, or learns in passing, and the key conditional independence relationships that enable its generalization ability.
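To make the setting concrete, here is a minimal sketch of zero-shot prediction with a contrastively pre-trained multimodal model. The encoders below are stand-in random projections (the real setting would use the image and text towers of a CLIP-style model); classes on the downstream task are described only by text prompts, and each input is assigned the class whose prompt embedding is most similar in the shared space. All names and dimensions are illustrative assumptions, not from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in "pre-trained" encoders: random projections into a shared embedding
# space. In practice these would be contrastively trained image/text towers.
D_IMG, D_TXT, D_EMB = 512, 256, 64
W_img = rng.normal(size=(D_IMG, D_EMB))
W_txt = rng.normal(size=(D_TXT, D_EMB))

def embed(x, W):
    """Project into the shared space and L2-normalize."""
    z = x @ W
    return z / np.linalg.norm(z, axis=-1, keepdims=True)

# Downstream task with no labeled data: classes are given by text prompts.
class_prompts = ["a photo of a cat", "a photo of a dog", "a photo of a bird"]
# Stand-in text features, one row per prompt (a real model would tokenize/encode).
T = embed(rng.normal(size=(len(class_prompts), D_TXT)), W_txt)

# A batch of stand-in image features.
X = embed(rng.normal(size=(5, D_IMG)), W_img)

# Zero-shot prediction: pick the class whose prompt embedding has the highest
# cosine similarity (dot product of unit vectors) with the input embedding.
scores = X @ T.T                 # (5, 3) similarity matrix
preds = scores.argmax(axis=1)    # predicted class index per image
```

No task-specific labels are used at any point; the text prompts alone define the label space, which is exactly the regime the paper's theory addresses.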

Cite

Text

Mehta and Harchaoui. "A Generalization Theory for Zero-Shot Prediction." Proceedings of the 42nd International Conference on Machine Learning, 2025.

Markdown

[Mehta and Harchaoui. "A Generalization Theory for Zero-Shot Prediction." Proceedings of the 42nd International Conference on Machine Learning, 2025.](https://mlanthology.org/icml/2025/mehta2025icml-generalization/)

BibTeX

@inproceedings{mehta2025icml-generalization,
  title     = {{A Generalization Theory for Zero-Shot Prediction}},
  author    = {Mehta, Ronak and Harchaoui, Zaid},
  booktitle = {Proceedings of the 42nd International Conference on Machine Learning},
  year      = {2025},
  pages     = {43603--43660},
  volume    = {267},
  url       = {https://mlanthology.org/icml/2025/mehta2025icml-generalization/}
}