LLM-as-a-Prophet: Understanding Predictive Intelligence with Prophet Arena

Abstract

With the rapid progress of large language models (LLMs) trained on every available piece of data, it becomes increasingly challenging to reliably evaluate their intelligence due to potential data contamination and benchmark overfitting. To overcome these challenges, we investigate a new angle of benchmarking LLMs' intelligence by evaluating their capability in forecasting real-world future events, a paradigm we call "LLM-as-a-Prophet". Such forecasting tasks require a combination of sophisticated capabilities while remaining free from data contamination or overfitting. To systematically evaluate such predictive intelligence of LLMs, we introduce $\texttt{Prophet Arena}$, a general evaluation benchmark that continuously collects live forecasting tasks and decomposes each task into distinct pipeline stages, supporting controlled and large-scale experimentation. Our comprehensive evaluation reveals that many LLMs already exhibit impressive forecasting capabilities, reflected in, e.g., their small calibration errors, consistent prediction confidence, and promising market returns. However, we also uncover key bottlenecks even in frontier models, such as inaccurate event recall, misunderstanding of data sources, and slower information aggregation compared to markets as resolution nears.

Cite

Text

Yang et al. "LLM-as-a-Prophet: Understanding Predictive Intelligence with Prophet Arena." International Conference on Learning Representations, 2026.

Markdown

[Yang et al. "LLM-as-a-Prophet: Understanding Predictive Intelligence with Prophet Arena." International Conference on Learning Representations, 2026.](https://mlanthology.org/iclr/2026/yang2026iclr-llmasaprophet/)

BibTeX

@inproceedings{yang2026iclr-llmasaprophet,
  title     = {{LLM-as-a-Prophet: Understanding Predictive Intelligence with Prophet Arena}},
  author    = {Yang, Qingchuan and Mahns, Simon and Li, Sida and Gu, Anri and Wu, Jibang and Xu, Haifeng},
  booktitle = {International Conference on Learning Representations},
  year      = {2026},
  url       = {https://mlanthology.org/iclr/2026/yang2026iclr-llmasaprophet/}
}