Enhancing Misinformation Detection in Spanish Language with Deep Learning: BERT and RoBERTa Transformer Models

Citations: 0
Authors
Blanco-Fernandez, Yolanda [1]
Otero-Vizoso, Javier [2]
Gil-Solla, Alberto [1]
Garcia-Duque, Jorge [2]
Affiliations
[1] Univ Vigo, AtlanTTic Res Ctr Telecommun Technol, Vigo 36310, Spain
[2] Univ Vigo, Escuela Ingn Telecomunicac, Vigo, Spain
Source
APPLIED SCIENCES-BASEL | 2024 / Vol. 14 / Issue 21
Keywords
fake news; Spanish; curated synthetic dataset; fine-tuning; Transformer-based models; BERT; RoBERTa
DOI
10.3390/app14219729
CLC classification number
O6 [Chemistry];
Subject classification code
0703;
Abstract
This paper presents an approach to identifying political fake news in Spanish using Transformer architectures. Current methodologies often overlook political news due to the lack of quality datasets, especially in Spanish. To address this, we created a synthetic dataset of 57,231 Spanish political news articles, gathered via automated web scraping and enhanced with generative large language models. This dataset is used for fine-tuning and benchmarking Transformer models like BERT and RoBERTa for fake news detection. Our fine-tuned models showed outstanding performance on this dataset, with accuracy ranging from 97.4% to 98.6%. However, testing with a smaller, independent hand-curated dataset, including statements from political leaders during Spain's July 2023 electoral debates, revealed a performance drop to 71%. Although this suggests that the model needs additional refinements to handle the complexity and variability of real-world political discourse, achieving over 70% accuracy seems a promising result in the under-explored domain of Spanish political fake news detection.
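The gap the abstract reports between accuracy on the synthetic dataset and accuracy on the independent debate set can be illustrated with a minimal Python sketch. This is not the authors' evaluation code: the helper names `accuracy` and `generalization_gap` are hypothetical, and the toy label lists are invented purely to show the computation. Only the figures 97.4%, 98.6%, and 71% come from the abstract.

```python
from typing import Sequence


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Return the fraction of predictions that match the gold labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("labels and predictions must have the same length")
    if len(y_true) == 0:
        raise ValueError("cannot compute accuracy on an empty set")
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def generalization_gap(in_dist_acc: float, out_dist_acc: float) -> float:
    """Accuracy lost when moving from the training distribution to new data."""
    return in_dist_acc - out_dist_acc


# Toy labels (1 = fake, 0 = real). Invented for illustration only;
# these are NOT data or results from the paper.
toy_true = [1, 0, 1, 1, 0, 0, 1, 0]
toy_pred_in_dist = [1, 0, 1, 1, 0, 0, 1, 0]   # all 8 correct
toy_pred_out_dist = [1, 0, 0, 1, 0, 1, 1, 0]  # 6 of 8 correct

toy_in_acc = accuracy(toy_true, toy_pred_in_dist)
toy_out_acc = accuracy(toy_true, toy_pred_out_dist)
toy_gap = generalization_gap(toy_in_acc, toy_out_acc)

# Gap implied by the figures reported in the abstract.
reported_gap_low = generalization_gap(0.974, 0.71)
reported_gap_high = generalization_gap(0.986, 0.71)

print(f"toy gap: {toy_gap:.3f}")
print(f"reported gap: {reported_gap_low:.3f} to {reported_gap_high:.3f}")
```

Applied to the reported figures, the same subtraction puts the drop at roughly 26 to 28 percentage points, which is the generalization gap the abstract attributes to the variability of real-world political discourse.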
Pages: 27