Inconsistent Reasoning Attacks to Identify Weaknesses in Automatic Scientific Claim Verification Tools

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

Scientific Claim Verification (SCV) tools are essential for evaluating the validity of scientific assertions, particularly within autonomous science. However, they often struggle to interpret complex scientific language and to detect reasoning flaws, leading to potential misclassification. Adversarial attacks, particularly paraphrase attacks, reveal these weaknesses by rewording claims while preserving their meaning. Paraphrase attacks are not the only way to identify weaknesses in SCV tools, but other existing methods often fail to preserve semantic equivalence and therefore require extensive human filtering. To address this, we define inconsistent reasoning attacks, a broader class of adversarial attack strategies that expose logical weaknesses in SCV systems. Using an evolutionary algorithm and large language models, this approach iteratively modifies claims to trigger misclassifications while maintaining logical inconsistencies. This method improves semantic accuracy and attack effectiveness, particularly for paraphrase-based attacks. Evaluation against a leading SCV system (MultiVerS) confirms persistent vulnerabilities, although a retrieval-augmented generation (RAG) system with an Attack-Reflection mechanism shows potential for mitigating them. The findings underscore the susceptibility of SCV systems to reasoning inconsistencies, which achieve a higher attack success rate than other attack techniques, and highlight the Attack-Reflection mechanism as a promising defense.

Original language: English
Title of host publication: Machine Learning and Knowledge Discovery in Databases. Research Track - European Conference, ECML PKDD 2025, Proceedings
Editors: Rita P. Ribeiro, Carlos Soares, João Gama, Bernhard Pfahringer, Nathalie Japkowicz, Pedro Larrañaga, Alípio M. Jorge, Pedro H. Abreu
Publisher: Springer Science and Business Media Deutschland GmbH
Pages: 56-73
Number of pages: 18
ISBN (Print): 9783032061089
State: Published - 2026
Event: European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, ECML PKDD 2025 - Porto, Portugal
Duration: 15 Sep 2025 – 19 Sep 2025

Publication series

Name: Lecture Notes in Computer Science
Volume: 16019 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, ECML PKDD 2025
Country/Territory: Portugal
City: Porto
Period: 15/09/25 – 19/09/25

Keywords

  • Adversarial Attacks
  • Automatic Scientific Claim Verification Tools
  • Large Language Models
  • Robustness

