Not Search, but Scan: Benchmarking MLLMs on Scan-Oriented Academic Paper Reasoning
Abstract
With the rapid progress of multimodal large language models (MLLMs), AI already performs well at literature retrieval and certain reasoning tasks, serving as a capable assistant to human researchers, yet it remains far from autonomous research. The fundamental reason is that current work on academic paper reasoning is largely confined to a search-oriented paradigm centered on pre-specified targets, with reasoning grounded in relevance retrieval, which struggles to support researcher-style full-document understanding, reasoning, and verification. To bridge this gap, we propose **ScholScan**, a new benchmark for academic paper reasoning. ScholScan introduces a scan-oriented task setting that asks models to read and cross-check entire papers like human researchers, scanning the document to identify consistency issues. The benchmark comprises 1,800 carefully annotated questions drawn from nine error categories across 13 natural-science domains and 715 papers, and provides detailed annotations for evidence localization and reasoning traces, together with a unified evaluation protocol. We assess 15 models across 24 input configurations and conduct a fine-grained analysis of MLLM capabilities across all error categories. Across the board, retrieval-augmented generation (RAG) methods yield no significant improvements, revealing systematic deficiencies of current MLLMs on scan-oriented tasks and underscoring the challenge posed by ScholScan. We expect ScholScan to be the leading and representative work of the scan-oriented task paradigm.
Cite
Text
Li et al. "Not Search, but Scan: Benchmarking MLLMs on Scan-Oriented Academic Paper Reasoning." International Conference on Learning Representations, 2026.
Markdown
[Li et al. "Not Search, but Scan: Benchmarking MLLMs on Scan-Oriented Academic Paper Reasoning." International Conference on Learning Representations, 2026.](https://mlanthology.org/iclr/2026/li2026iclr-search/)
BibTeX
@inproceedings{li2026iclr-search,
title = {{Not Search, but Scan: Benchmarking MLLMs on Scan-Oriented Academic Paper Reasoning}},
author = {Li, Rongjin and Tang, Zichen and Wang, Xianghe and Hu, Xinyi and Wang, Zhengyu and Lu, Zhengyu and Huang, Yiling and Chen, Jiayuan and Tan, Weisheng and Liu, Jiacheng and Yang, Zhongjun and E, Haihong},
booktitle = {International Conference on Learning Representations},
year = {2026},
url = {https://mlanthology.org/iclr/2026/li2026iclr-search/}
}