OmniPred: Language Models as Universal Regressors

Abstract

Regression is a powerful tool to accurately predict the outcome metric of a system given a set of parameters, but has traditionally been restricted to methods which are only applicable to a specific task. In this paper, we propose OmniPred, a framework for training language models as universal end-to-end regressors over (x,y) data from arbitrary formats. Using data sourced from Google Vizier, one of the largest proprietary blackbox optimization databases in the world, our extensive experiments demonstrate that language models are capable of very precise numerical regression using only textual representations of mathematical parameters and values, and if given the opportunity to train at scale over multiple tasks, can significantly outperform traditional regression models.
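The abstract's key idea is that (x, y) pairs from blackbox optimization trials are turned into plain text so a language model can predict y from x. Below is a minimal illustrative sketch of one way such serialization could look; the field names, layout, and digit-wise y encoding are assumptions for illustration, not the paper's exact format.

```python
import math

def serialize_x(task_metadata: dict, params: dict) -> str:
    """Flatten task metadata and parameter values into a single text prompt (hypothetical format)."""
    meta = ",".join(f"{k}:{v}" for k, v in sorted(task_metadata.items()))
    xs = ",".join(f"{name}:{value}" for name, value in sorted(params.items()))
    return f"metadata:{{{meta}}} params:{{{xs}}}"

def serialize_y(y: float, sig_digits: int = 4) -> str:
    """Encode the objective value as sign, mantissa digits, and an exponent token (assumed encoding)."""
    sign = "+" if y >= 0 else "-"
    y = abs(y)
    exponent = 0 if y == 0 else math.floor(math.log10(y))
    mantissa = 0.0 if y == 0 else y / (10 ** exponent)
    digits = f"{mantissa:.{sig_digits - 1}f}".replace(".", "")
    return f"{sign} {' '.join(digits)} E{exponent}"

# Example: one trial from a hypothetical hyperparameter-tuning task.
x_text = serialize_x(
    {"task": "cifar10", "objective": "accuracy"},
    {"learning_rate": 1e-3, "batch_size": 128},
)
y_text = serialize_y(0.9213)
print(x_text)  # metadata:{objective:accuracy,task:cifar10} params:{batch_size:128,learning_rate:0.001}
print(y_text)  # + 9 2 1 3 E-1
```

Because both inputs and targets are plain strings, a single text-to-text model can be trained across many tasks with heterogeneous parameter spaces, which is what enables the multi-task scaling the abstract refers to.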

Cite

Text

Song et al. "OmniPred: Language Models as Universal Regressors." Transactions on Machine Learning Research, 2024.

Markdown

[Song et al. "OmniPred: Language Models as Universal Regressors." Transactions on Machine Learning Research, 2024.](https://mlanthology.org/tmlr/2024/song2024tmlr-omnipred/)

BibTeX

@article{song2024tmlr-omnipred,
  title     = {{OmniPred: Language Models as Universal Regressors}},
  author    = {Song, Xingyou and Li, Oscar and Lee, Chansoo and Yang, Bangding and Peng, Daiyi and Perel, Sagi and Chen, Yutian},
  journal   = {Transactions on Machine Learning Research},
  year      = {2024},
  url       = {https://mlanthology.org/tmlr/2024/song2024tmlr-omnipred/}
}