TY - GEN
T1 - Inconsistent Reasoning Attacks to Identify Weaknesses in Automatic Scientific Claim Verification Tools
AU - Islam, Md Athikul
AU - Ellison, Noel
AU - Lakha, Bishal
AU - Serra, Edoardo
N1 - Publisher Copyright:
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2026.
PY - 2026
Y1 - 2026
N2 - Scientific Claim Verification (SCV) tools are essential for evaluating the validity of scientific assertions, particularly within autonomous science. However, they often struggle to interpret complex scientific language and detect reasoning flaws, leading to potential misclassification. Adversarial attacks, particularly paraphrase attacks, reveal these weaknesses by rewording claims while maintaining their meaning. Paraphrase attacks are not the only way to identify weaknesses in SCV tools, but other existing methods often fail to preserve semantic equivalence, requiring extensive human filtering. To address this, we define inconsistent reasoning attacks, a broader class of adversarial attack strategies that expose logical weaknesses in SCV systems. Using an evolutionary algorithm and large language models, this approach iteratively modifies claims to trigger misclassifications while maintaining logical inconsistencies. This method improves semantic accuracy and attack effectiveness, particularly for paraphrase-based attacks. Evaluation against a leading SCV system (MultiVerS) confirms persistent vulnerabilities, although a retrieval-augmented generation (RAG) system with an Attack-Reflection mechanism shows promise in mitigating these issues. The findings emphasize the susceptibility of SCV systems to reasoning inconsistencies, with a higher attack success rate than other attack techniques, and highlight the Attack-Reflection mechanism as a promising defense.
AB - Scientific Claim Verification (SCV) tools are essential for evaluating the validity of scientific assertions, particularly within autonomous science. However, they often struggle to interpret complex scientific language and detect reasoning flaws, leading to potential misclassification. Adversarial attacks, particularly paraphrase attacks, reveal these weaknesses by rewording claims while maintaining their meaning. Paraphrase attacks are not the only way to identify weaknesses in SCV tools, but other existing methods often fail to preserve semantic equivalence, requiring extensive human filtering. To address this, we define inconsistent reasoning attacks, a broader class of adversarial attack strategies that expose logical weaknesses in SCV systems. Using an evolutionary algorithm and large language models, this approach iteratively modifies claims to trigger misclassifications while maintaining logical inconsistencies. This method improves semantic accuracy and attack effectiveness, particularly for paraphrase-based attacks. Evaluation against a leading SCV system (MultiVerS) confirms persistent vulnerabilities, although a retrieval-augmented generation (RAG) system with an Attack-Reflection mechanism shows promise in mitigating these issues. The findings emphasize the susceptibility of SCV systems to reasoning inconsistencies, with a higher attack success rate than other attack techniques, and highlight the Attack-Reflection mechanism as a promising defense.
KW - Adversarial Attacks
KW - Automatic Scientific Claim Verification Tools
KW - Large Language Models
KW - Robustness
UR - https://www.scopus.com/pages/publications/105019533526
U2 - 10.1007/978-3-032-06109-6_4
DO - 10.1007/978-3-032-06109-6_4
M3 - Conference contribution
AN - SCOPUS:105019533526
SN - 9783032061089
T3 - Lecture Notes in Computer Science
SP - 56
EP - 73
BT - Machine Learning and Knowledge Discovery in Databases. Research Track - European Conference, ECML PKDD 2025, Proceedings
A2 - Ribeiro, Rita P.
A2 - Soares, Carlos
A2 - Gama, João
A2 - Pfahringer, Bernhard
A2 - Japkowicz, Nathalie
A2 - Larrañaga, Pedro
A2 - Jorge, Alípio M.
A2 - Abreu, Pedro H.
PB - Springer Science and Business Media Deutschland GmbH
T2 - European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, ECML PKDD 2025
Y2 - 15 September 2025 through 19 September 2025
ER -