Multilingual Hate Speech Detection: A Semi-Supervised Generative Adversarial Approach

Cited by: 2
Authors
Mnassri, Khouloud [1]
Farahbakhsh, Reza [1]
Crespi, Noel [1]
Affiliations
[1] Inst Polytech Paris, Samovar Telecom SudParis, F-91120 Palaiseau, France
Keywords
social media; hate speech; semisupervised; GAN; multilingual; PLMs; data augmentation; networks
DOI
10.3390/e26040344
Chinese Library Classification
O4 [Physics]
Discipline classification code
0702
Abstract
Social media platforms have surpassed cultural and linguistic boundaries, thus enabling online communication worldwide. However, the expanded use of various languages has intensified the challenge of online detection of hate speech content. Despite the release of multiple Natural Language Processing (NLP) solutions implementing cutting-edge machine learning techniques, the scarcity of data, especially labeled data, remains a considerable obstacle, which further requires the use of semisupervised approaches along with Generative Artificial Intelligence (Generative AI) techniques. This paper introduces an innovative approach, a multilingual semisupervised model combining Generative Adversarial Networks (GANs) and Pretrained Language Models (PLMs), more precisely mBERT and XLM-RoBERTa. Our approach proves its effectiveness in the detection of hate speech and offensive language in Indo-European languages (in English, German, and Hindi) when employing only 20% annotated data from the HASOC2019 dataset, thereby presenting significantly high performances in each of multilingual, zero-shot crosslingual, and monolingual training scenarios. Our study provides a robust mBERT-based semisupervised GAN model (SS-GAN-mBERT) that outperformed the XLM-RoBERTa-based model (SS-GAN-XLM) and reached an average F1 score boost of 9.23% and an accuracy increase of 5.75% over the baseline semisupervised mBERT model.
Pages: 19