GDPval: Evaluating AI Model Performance on Real-World Economically Valuable Tasks

Abstract

We introduce GDPval, a benchmark evaluating AI model capabilities on real-world economically valuable knowledge-work tasks. GDPval covers the majority of Department of Labor O*NET Work Activities for 44 occupations across the top 9 sectors contributing to U.S. GDP (Gross Domestic Product). Tasks are constructed from the representative work of industry professionals with an average of 14 years of experience. We find that frontier model performance on GDPval is improving roughly linearly over time, and that the current best frontier models are approaching industry experts in deliverable quality. We analyze the potential for frontier models, when paired with human oversight, to perform GDPval tasks cheaper and faster than unaided experts. We also demonstrate that increased reasoning effort, increased task context, and increased scaffolding improve model performance on GDPval. Finally, we open-source a gold subset of 220 tasks and provide a public automated grading service to facilitate future research in understanding real-world model capabilities.

Cite

Text

Patwardhan et al. "GDPval: Evaluating AI Model Performance on Real-World Economically Valuable Tasks." International Conference on Learning Representations, 2026.

Markdown

[Patwardhan et al. "GDPval: Evaluating AI Model Performance on Real-World Economically Valuable Tasks." International Conference on Learning Representations, 2026.](https://mlanthology.org/iclr/2026/patwardhan2026iclr-gdpval/)

BibTeX

@inproceedings{patwardhan2026iclr-gdpval,
  title     = {{GDPval: Evaluating AI Model Performance on Real-World Economically Valuable Tasks}},
  author    = {Patwardhan, Tejal and Dias, Rachel and Proehl, Elizabeth and Kim, Grace and Wang, Michele and Watkins, Olivia and Fishman, Simon Posada and Aljubeh, Marwan and Thacker, Phoebe and Fauconnet, Laurance and Kim, Natalie S. and Miserendino, Samuel and Chabot, Gildas and Li, David and Chao, Patrick and Sharman, Michael and Barr, Alexandra and Glaese, Amelia and Tworek, Jerry},
  booktitle = {International Conference on Learning Representations},
  year      = {2026},
  url       = {https://mlanthology.org/iclr/2026/patwardhan2026iclr-gdpval/}
}