Inconsistent Reasoning Attacks to Identify Weaknesses in Automatic Scientific Claim Verification Tools
Abstract
Scientific Claim Verification (SCV) tools are essential for evaluating the validity of scientific assertions, particularly within autonomous science. However, they often struggle to interpret complex scientific language and detect reasoning flaws, leading to potential misclassifications. Adversarial attacks, particularly paraphrase attacks, reveal these weaknesses by rewording claims while maintaining their meaning. Paraphrase attacks are not the only way to identify weaknesses in SCV tools, but other existing attack methods often fail to preserve semantic equivalence and therefore require extensive human filtering. To address this, we define inconsistent reasoning attacks, a broader class of adversarial attack strategies that expose logical weaknesses in SCV systems. Using an evolutionary algorithm and large language models, this approach iteratively modifies claims to trigger misclassifications that expose logical inconsistencies. This method improves semantic accuracy and attack effectiveness, particularly for paraphrase-based attacks. Evaluation against a leading SCV system (MultiVerS) confirms persistent vulnerabilities, although a retrieval-augmented generation (RAG) system with an Attack-Reflection mechanism shows promise in mitigating these issues. The findings emphasize the susceptibility of SCV systems to reasoning inconsistencies, with a higher attack success rate than other attack techniques, and highlight the Attack-Reflection mechanism as a promising defense.
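The evolutionary attack loop described in the abstract can be sketched in heavily simplified form. This is not the authors' implementation: the synonym table, the keyword-based `verify` stub, and all parameters below are illustrative assumptions; in the paper's setting, mutation would be performed by an LLM paraphraser and verification by a real SCV system such as MultiVerS.

```python
import random

# Hypothetical stand-in for LLM-based paraphrasing: a tiny synonym table.
SYNONYMS = {
    "reduces": ["lowers", "decreases", "diminishes"],
    "risk": ["likelihood", "chance"],
}

def mutate(claim: str) -> str:
    """Produce a meaning-preserving variant by swapping one word for a synonym."""
    words = claim.split()
    idx = [i for i, w in enumerate(words) if w in SYNONYMS]
    if not idx:
        return claim
    i = random.choice(idx)
    words[i] = random.choice(SYNONYMS[words[i]])
    return " ".join(words)

def verify(claim: str) -> str:
    """Stub SCV verdict: a brittle keyword rule standing in for a real model."""
    return "SUPPORT" if "reduces" in claim and "risk" in claim else "NEI"

def evolve_attack(claim: str, generations: int = 50, pop_size: int = 8):
    """Evolutionary search for a paraphrase that flips the verifier's verdict.

    A flipped verdict on a semantically equivalent claim is an inconsistency:
    the verifier cannot be right about both phrasings at once.
    """
    original_verdict = verify(claim)
    population = [claim]
    for _ in range(generations):
        offspring = [mutate(random.choice(population)) for _ in range(pop_size)]
        for cand in offspring:
            if verify(cand) != original_verdict:
                return cand  # inconsistent verdict found
        population = (population + offspring)[-pop_size:]
    return None

random.seed(0)
attack = evolve_attack("aspirin reduces heart-attack risk")
print(attack)
```

Because the stub verifier keys on surface words rather than meaning, a single synonym swap is enough to flip its verdict, which is exactly the kind of brittleness the attacks in the paper are designed to surface at scale.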
Cite
Text
Islam et al. "Inconsistent Reasoning Attacks to Identify Weaknesses in Automatic Scientific Claim Verification Tools." European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, 2025. doi:10.1007/978-3-032-06109-6_4

Markdown
[Islam et al. "Inconsistent Reasoning Attacks to Identify Weaknesses in Automatic Scientific Claim Verification Tools." European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, 2025.](https://mlanthology.org/ecmlpkdd/2025/islam2025ecmlpkdd-inconsistent/) doi:10.1007/978-3-032-06109-6_4

BibTeX
@inproceedings{islam2025ecmlpkdd-inconsistent,
title = {{Inconsistent Reasoning Attacks to Identify Weaknesses in Automatic Scientific Claim Verification Tools}},
author = {Islam, Md Athikul and Ellison, Noel and Lakha, Bishal and Serra, Edoardo},
booktitle = {European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases},
year = {2025},
pages = {56--73},
doi = {10.1007/978-3-032-06109-6_4},
url = {https://mlanthology.org/ecmlpkdd/2025/islam2025ecmlpkdd-inconsistent/}
}