Enhancing Misinformation Detection in Spanish Language with Deep Learning: BERT and RoBERTa Transformer Models

Authors
Blanco-Fernandez, Yolanda [1]
Otero-Vizoso, Javier [2]
Gil-Solla, Alberto [1]
Garcia-Duque, Jorge [2]
Affiliations
[1] Univ Vigo, AtlanTTic Res Ctr Telecommun Technol, Vigo 36310, Spain
[2] Univ Vigo, Escuela Ingn Telecomunicac, Vigo, Spain
Source
APPLIED SCIENCES-BASEL | 2024 / Volume 14 / Issue 21
Keywords
fake news; Spanish; curated synthetic dataset; fine-tuning; Transformer-based models; BERT; RoBERTa
DOI
10.3390/app14219729
Chinese Library Classification
O6 [Chemistry];
Discipline classification code
0703;
Abstract
This paper presents an approach to identifying political fake news in Spanish using Transformer architectures. Current methodologies often overlook political news due to the lack of quality datasets, especially in Spanish. To address this, we created a synthetic dataset of 57,231 Spanish political news articles, gathered via automated web scraping and enhanced with generative large language models. This dataset is used for fine-tuning and benchmarking Transformer models like BERT and RoBERTa for fake news detection. Our fine-tuned models showed outstanding performance on this dataset, with accuracy ranging from 97.4% to 98.6%. However, testing with a smaller, independent hand-curated dataset, including statements from political leaders during Spain's July 2023 electoral debates, revealed a performance drop to 71%. Although this suggests that the model needs additional refinements to handle the complexity and variability of real-world political discourse, achieving over 70% accuracy seems a promising result in the under-explored domain of Spanish political fake news detection.
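The gap the abstract reports between in-distribution accuracy and accuracy on the independent debate set can be made concrete with a small calculation. The following is a minimal sketch, not the authors' evaluation code: the function names `accuracy` and `generalization_gap` and all label lists are hypothetical toy values chosen for illustration (1 = fake, 0 = real), and they do not come from the paper's data.

```python
from typing import Sequence


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Return the fraction of predictions that match the gold labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("label and prediction lists must have the same length")
    if not y_true:
        raise ValueError("cannot compute accuracy on an empty set")
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def generalization_gap(in_domain_acc: float, out_of_domain_acc: float) -> float:
    """Return how much accuracy is lost when moving to the independent set."""
    return in_domain_acc - out_of_domain_acc


# Hypothetical toy labels, NOT taken from the paper (1 = fake, 0 = real).
held_out_true = [1, 0, 1, 0, 1, 0, 1, 0, 1, 0]
held_out_pred = [1, 0, 1, 0, 1, 0, 1, 0, 1, 1]  # one error on the held-out split

debate_true = [1, 0, 1, 0, 1, 0, 1, 0, 1, 0]
debate_pred = [1, 0, 0, 0, 1, 1, 1, 0, 0, 0]  # three errors on the independent set

in_acc = accuracy(held_out_true, held_out_pred)
out_acc = accuracy(debate_true, debate_pred)
gap = generalization_gap(in_acc, out_acc)

print(f"held-out accuracy:    {in_acc:.1%}")
print(f"independent accuracy: {out_acc:.1%}")
print(f"gap:                  {gap * 100:.1f} percentage points")
```

Applied to the figures stated in the abstract (97.4% to 98.6% on the synthetic dataset, 71% on the hand-curated set), the same subtraction gives a drop of roughly 26.4 to 27.6 percentage points.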
Pages: 27