AVeriTeC: A Dataset for Real-World Claim Verification with Evidence from the Web

Abstract

Existing datasets for automated fact-checking have substantial limitations, such as relying on artificial claims, lacking annotations for evidence and intermediate reasoning, or including evidence published after the claim. In this paper we introduce AVeriTeC, a new dataset of 4,568 real-world claims covering fact-checks by 50 different organizations. Each claim is annotated with question-answer pairs supported by evidence available online, as well as textual justifications explaining how the evidence combines to produce a verdict. Through a multi-round annotation process, we avoid common pitfalls including context dependence, evidence insufficiency, and temporal leakage, and reach a substantial inter-annotator agreement of $\kappa=0.619$ on verdicts. We develop a baseline as well as an evaluation scheme for verifying claims through question-answering against the open web.
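The abstract describes each claim as carrying question-answer pairs grounded in web evidence, a verdict, and a textual justification. The sketch below illustrates one plausible shape for such a record in Python; the field names (claim, questions, answers, label, justification) and all example values are assumptions for exposition only, not the released AVeriTeC file format.

# Illustrative sketch only: field names and values are assumptions,
# not the official AVeriTeC schema.
import json

# A hypothetical claim record mirroring the annotation scheme described
# in the abstract: a real-world claim, question-answer pairs supported
# by online evidence, a verdict label, and a textual justification.
example_record = {
    "claim": "Example claim text as fact-checked by an organisation.",
    "questions": [
        {
            "question": "What does the cited source actually report?",
            "answers": [
                {
                    "answer": "A short answer extracted from a web page.",
                    "source_url": "https://example.org/evidence",  # hypothetical URL
                }
            ],
        }
    ],
    "label": "Refuted",  # e.g. Supported / Refuted / Not Enough Evidence / Conflicting
    "justification": "Explanation of how the answers combine into the verdict.",
}

def summarise(record: dict) -> str:
    """Return a one-line summary of a claim record."""
    n_answers = sum(len(q["answers"]) for q in record["questions"])
    return f"{record['label']}: {len(record['questions'])} question(s), {n_answers} answer(s)"

if __name__ == "__main__":
    print(json.dumps(example_record, indent=2))
    print(summarise(example_record))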

Cite

Text

Schlichtkrull et al. "AVeriTeC: A Dataset for Real-World Claim Verification with Evidence from the Web." Neural Information Processing Systems, 2023.

Markdown

[Schlichtkrull et al. "AVeriTeC: A Dataset for Real-World Claim Verification with Evidence from the Web." Neural Information Processing Systems, 2023.](https://mlanthology.org/neurips/2023/schlichtkrull2023neurips-averitec/)

BibTeX

@inproceedings{schlichtkrull2023neurips-averitec,
  title     = {{AVeriTeC: A Dataset for Real-World Claim Verification with Evidence from the Web}},
  author    = {Schlichtkrull, Michael and Guo, Zhijiang and Vlachos, Andreas},
  booktitle = {Neural Information Processing Systems},
  year      = {2023},
  url       = {https://mlanthology.org/neurips/2023/schlichtkrull2023neurips-averitec/}
}