SCI-Verifier: Scientific Verifier with Thinking
Abstract
As large language models (LLMs) are increasingly applied to scientific reasoning, the complexity of answer formats and the diversity of equivalent expressions make answer verification a critical yet challenging task. Existing verification studies in scientific domains suffer from two major limitations: (a) the absence of systematic evaluation standards and insufficient disciplinary coverage, which hinders their comprehensive assessment; and (b) heavy reliance on cumbersome rule design or prompt engineering, which reduces their effectiveness in complex reasoning scenarios or limits their cross-disciplinary generalization. To address these challenges, we propose solutions at both the data and model levels. On the data side, we construct **SCI-VerifyBench**, a cross-disciplinary benchmark covering mathematics, physics, biology, chemistry, and general scientific QA. The benchmark is built from real LLM responses and enhanced with domain-specific equivalence transformations that generate challenging and realistic data. Model-based and expert annotations ensure both quality and diversity, enabling rigorous evaluation of verification ability. On the model side, we emphasize the importance of reasoning for verification and introduce **SCI-Verifier**, a unified reasoning-augmented verifier for scientific domains. Through post-training, SCI-Verifier demonstrates strong logical reasoning and equivalence judgment capabilities while maintaining concise and stable outputs. Together, SCI-VerifyBench and SCI-Verifier provide a principled framework for scientific verification, offering both systematic evaluation and practical pathways to enhance the reliability and applicability of LLMs in scientific domains.
Cite
Text
Zheng et al. "SCI-Verifier: Scientific Verifier with Thinking." International Conference on Learning Representations, 2026.
Markdown
[Zheng et al. "SCI-Verifier: Scientific Verifier with Thinking." International Conference on Learning Representations, 2026.](https://mlanthology.org/iclr/2026/zheng2026iclr-sciverifier/)
BibTeX
@inproceedings{zheng2026iclr-sciverifier,
  title     = {{SCI-Verifier: Scientific Verifier with Thinking}},
  author    = {Zheng, Shenghe and Huang, Chenyu and Yu, Fangchen and Yao, Junchi and Ye, Jingqi and Chen, Tao and Luo, Yun and Ding, Ning and Bai, Lei and Cui, Ganqu and Ye, Peng},
  booktitle = {International Conference on Learning Representations},
  year      = {2026},
  url       = {https://mlanthology.org/iclr/2026/zheng2026iclr-sciverifier/}
}