CiteME: Can Language Models Accurately Cite Scientific Claims?

Abstract

Thousands of new scientific papers are published each month. Such information overload complicates researchers' efforts to stay current with the state of the art as well as to verify and correctly attribute claims. We pose the following research question: Given a text excerpt referencing a paper, could an LM act as a research assistant to correctly identify the referenced paper? We advance efforts to answer this question by building a benchmark that evaluates the abilities of LMs in citation attribution. Our benchmark, CiteME, consists of text excerpts from recent machine learning papers, each referencing a single other paper. Evaluation on CiteME reveals a large gap between frontier LMs and human performance, with LMs achieving only 4.2-18.5% accuracy and humans 69.7%. We narrow this gap by introducing CiteAgent, an autonomous system built on the GPT-4o LM that can also search and read papers, which achieves an accuracy of 35.3% on CiteME. Overall, CiteME serves as a challenging testbed for open-ended claim attribution, driving the research community towards a future where any claim made by an LM can be automatically verified and discarded if found to be incorrect.

Cite

Text

Press et al. "CiteME: Can Language Models Accurately Cite Scientific Claims?" Neural Information Processing Systems, 2024. doi:10.52202/079017-0252

Markdown

[Press et al. "CiteME: Can Language Models Accurately Cite Scientific Claims?" Neural Information Processing Systems, 2024.](https://mlanthology.org/neurips/2024/press2024neurips-citeme/) doi:10.52202/079017-0252

BibTeX

@inproceedings{press2024neurips-citeme,
  title     = {{CiteME: Can Language Models Accurately Cite Scientific Claims?}},
  author    = {Press, Ori and Hochlehnert, Andreas and Prabhu, Ameya and Udandarao, Vishaal and Press, Ofir and Bethge, Matthias},
  booktitle = {Neural Information Processing Systems},
  year      = {2024},
  doi       = {10.52202/079017-0252},
  url       = {https://mlanthology.org/neurips/2024/press2024neurips-citeme/}
}