Enhancing Misinformation Detection in Spanish Language with Deep Learning: BERT and RoBERTa Transformer Models

Cited by: 0

Authors:

Blanco-Fernandez, Yolanda [1]

Otero-Vizoso, Javier [2]

Gil-Solla, Alberto [1]

Garcia-Duque, Jorge [2]

Affiliations:

[1] Univ Vigo, AtlanTTic Res Ctr Telecommun Technol, Vigo 36310, Spain

[2] Univ Vigo, Escuela Ingn Telecomunicac, Vigo, Spain

Source:

APPLIED SCIENCES-BASEL | 2024 / Vol. 14 / Issue 21

Keywords:

fake news; Spanish; curated synthetic dataset; fine-tuning; Transformer-based models; BERT; RoBERTa

DOI:

10.3390/app14219729

Chinese Library Classification:

O6 [Chemistry]

Discipline classification code:

0703

Abstract:

This paper presents an approach to identifying political fake news in Spanish using Transformer architectures. Current methodologies often overlook political news due to the lack of quality datasets, especially in Spanish. To address this, we created a synthetic dataset of 57,231 Spanish political news articles, gathered via automated web scraping and enhanced with generative large language models. This dataset is used for fine-tuning and benchmarking Transformer models such as BERT and RoBERTa for fake news detection. Our fine-tuned models showed outstanding performance on this dataset, with accuracy ranging from 97.4% to 98.6%. However, testing with a smaller, independent hand-curated dataset, including statements from political leaders during Spain's July 2023 electoral debates, revealed a performance drop to 71%. Although this suggests that the model needs additional refinements to handle the complexity and variability of real-world political discourse, achieving over 70% accuracy seems a promising result in the under-explored domain of Spanish political fake news detection.


Pages: 27
