Approaching Human-Level Forecasting with Language Models

Abstract

Forecasting future events is important for policy and decision making. In this work, we study whether language models (LMs) can forecast at the level of competitive human forecasters. Towards this goal, we develop a retrieval-augmented LM system designed to automatically search for relevant information, generate forecasts, and aggregate predictions. To facilitate our study, we collect a large dataset of questions from competitive forecasting platforms. On a test set published after the knowledge cut-offs of our LMs, we evaluate the end-to-end performance of our system against the aggregates of human forecasts. On average, the system nears the crowd aggregate of competitive forecasters and, in a certain relaxed setting, surpasses it. Our work suggests that using LMs to forecast the future could provide accurate predictions at scale and help to inform institutional decision making.

Cite

Text

Halawi et al. "Approaching Human-Level Forecasting with Language Models." Neural Information Processing Systems, 2024. doi:10.52202/079017-1598

Markdown

[Halawi et al. "Approaching Human-Level Forecasting with Language Models." Neural Information Processing Systems, 2024.](https://mlanthology.org/neurips/2024/halawi2024neurips-approaching/) doi:10.52202/079017-1598

BibTeX

@inproceedings{halawi2024neurips-approaching,
  title     = {{Approaching Human-Level Forecasting with Language Models}},
  author    = {Halawi, Danny and Zhang, Fred and Yueh-Han, Chen and Steinhardt, Jacob},
  booktitle = {Neural Information Processing Systems},
  year      = {2024},
  doi       = {10.52202/079017-1598},
  url       = {https://mlanthology.org/neurips/2024/halawi2024neurips-approaching/}
}