CHAMP: A Competition-Level Dataset for Fine-Grained Analyses of LLMs' Mathematical Reasoning Capabilities

Abstract

Recent large language models (LLMs) have shown indications of mathematical reasoning ability. However, for challenging math problems, it remains unclear what information about a problem helps (or hurts) model performance. In this paper, we propose a challenging benchmark dataset for such analyses. The Concept and Hint-Annotated Math Problems (CHAMP) dataset consists of competition-level math problems annotated with "concepts," or general math facts, and "hints," or problem-specific tricks. These entities and their interconnections allow us to explore the effects of additional information, such as relevant hints, misleading concepts, or related problems. We conduct 12 preliminary studies with 4 models, summarize our findings, and discuss how CHAMP supports broader investigations of LLMs' ability to understand and use context. The dataset, code, and an extended version of the paper are available on the project website at https://yujunmao1.github.io/CHAMP.

Cite

Text

Mao et al. "CHAMP: A Competition-Level Dataset for Fine-Grained Analyses of LLMs' Mathematical Reasoning Capabilities." NeurIPS 2023 Workshops: MATH-AI, 2023.

Markdown

[Mao et al. "CHAMP: A Competition-Level Dataset for Fine-Grained Analyses of LLMs' Mathematical Reasoning Capabilities." NeurIPS 2023 Workshops: MATH-AI, 2023.](https://mlanthology.org/neuripsw/2023/mao2023neuripsw-champ/)

BibTeX

@inproceedings{mao2023neuripsw-champ,
  title     = {{CHAMP: A Competition-Level Dataset for Fine-Grained Analyses of LLMs' Mathematical Reasoning Capabilities}},
  author    = {Mao, Yujun and Kim, Yoon and Zhou, Yilun},
  booktitle = {NeurIPS 2023 Workshops: MATH-AI},
  year      = {2023},
  url       = {https://mlanthology.org/neuripsw/2023/mao2023neuripsw-champ/}
}