
Is checkworthiness generalizable? Evaluating task and domain generalization of datasets for claim detection.

Author: Nenno, S.
Published in: Neural Computing & Applications
Year: 2024
Type: Academic articles
DOI: 10.1007/s00521-024-09896-4

The spread of misinformation has reached a level at which neither researchers nor fact-checkers can monitor it by manual means alone. Accordingly, there has been much research on models and datasets for detecting checkworthy claims. However, this NLP research is mostly detached from findings in communication science on misinformation and fact-checking. Checkworthiness is a notoriously vague concept whose meaning is contested among different stakeholders. Against the background of news value theory, i.e., the study of factors that make an event relevant for journalistic reporting, this is not surprising. The paper argues that this vagueness leads to inconsistencies and poor generalization across different datasets and domains. For the experiments, models are trained on one dataset and tested on the remaining datasets. The results are compared with the performance on the original dataset, with a random baseline, and with the scores obtained when the models are not trained at all. The study finds a drastic drop relative to the performance on the original dataset. Moreover, the models are often outperformed by the random baseline, and training on one dataset has no impact, or even a negative impact, on performance on the other datasets. The paper proposes that future research should abandon this task design and instead take inspiration from research in communication science. In the style of news values, claim detection should focus on factors that are relevant for fact-checkers and misinformation.
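The evaluation design described in the abstract (train on one dataset, test on the others, compare with a random baseline) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the two toy datasets, the keyword-based stand-in for a model, and all function names are hypothetical and are not taken from the paper, whose actual models, datasets, and metrics are not specified on this page. The scores it produces say nothing about the paper's results.

```python
import random
from collections import Counter


def f1(gold, pred):
    """Binary F1 for the positive (checkworthy) class."""
    tp = sum(1 for g, p in zip(gold, pred) if g == 1 and p == 1)
    fp = sum(1 for g, p in zip(gold, pred) if g == 0 and p == 1)
    fn = sum(1 for g, p in zip(gold, pred) if g == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def train_keyword_model(examples):
    """Hypothetical stand-in for a real classifier: flags a sentence if it
    contains any token seen more often in positive than negative examples."""
    pos, neg = Counter(), Counter()
    for text, label in examples:
        (pos if label == 1 else neg).update(set(text.lower().split()))
    keywords = {w for w in pos if pos[w] > neg[w]}
    return lambda text: int(any(w in keywords for w in text.lower().split()))


def random_baseline(seed=0):
    """Baseline that ignores the input and guesses a label at random."""
    rng = random.Random(seed)
    return lambda text: rng.randint(0, 1)


def evaluate(model, test_set):
    gold = [label for _, label in test_set]
    pred = [model(text) for text, _ in test_set]
    return f1(gold, pred)


def cross_dataset_scores(datasets, train_fn):
    """Train on each dataset's train split, score on every test split."""
    scores = {}
    for train_name, splits in datasets.items():
        model = train_fn(splits["train"])
        for test_name, test_splits in datasets.items():
            scores[(train_name, test_name)] = evaluate(model, test_splits["test"])
    return scores


# Toy data, invented for this sketch (label 1 = checkworthy).
DATASETS = {
    "politics": {
        "train": [("unemployment fell by 5 percent last year", 1),
                  ("i love this debate", 0),
                  ("taxes rose by 10 percent", 1),
                  ("good evening everyone", 0)],
        "test": [("crime fell by 3 percent", 1), ("thank you all", 0)],
    },
    "health": {
        "train": [("the vaccine causes infertility", 1),
                  ("stay safe everyone", 0),
                  ("masks cause oxygen loss", 1),
                  ("i feel tired today", 0)],
        "test": [("the vaccine causes autism", 1), ("what a day", 0)],
    },
}

scores = cross_dataset_scores(DATASETS, train_keyword_model)
baseline = {name: evaluate(random_baseline(), splits["test"])
            for name, splits in DATASETS.items()}

for (train_name, test_name), score in sorted(scores.items()):
    print(f"train={train_name:8s} test={test_name:8s} F1={score:.2f}")
```

Comparing each off-diagonal entry of `scores` (train and test datasets differ) with the matching diagonal entry and with `baseline` mirrors the three-way comparison the abstract describes. The toy data is constructed so that the keyword model does not transfer between the two domains, which only demonstrates the mechanics of the comparison.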



Connected HIIG researchers

Sami Nenno

Research associate: AI & Society Lab
