Multilingual Hate Speech Detection: A Semi-Supervised Generative Adversarial Approach

Cited by: 2
Authors
Mnassri, Khouloud [1]
Farahbakhsh, Reza [1]
Crespi, Noel [1]
Affiliations
[1] Inst Polytech Paris, Samovar Telecom SudParis, F-91120 Palaiseau, France
Keywords
social media; hate speech; semisupervised; GAN; multilingual; PLMs; data augmentation; networks
DOI
10.3390/e26040344
Chinese Library Classification (CLC) number
O4 [Physics]
Discipline classification code
0702
Abstract
Social media platforms have transcended cultural and linguistic boundaries, enabling online communication worldwide. However, the growing use of many languages has made detecting online hate speech more difficult. Despite the release of multiple Natural Language Processing (NLP) solutions implementing cutting-edge machine learning techniques, the scarcity of data, especially labeled data, remains a considerable obstacle, which motivates the use of semisupervised approaches together with Generative Artificial Intelligence (Generative AI) techniques. This paper introduces a multilingual semisupervised model that combines Generative Adversarial Networks (GANs) with Pretrained Language Models (PLMs), specifically mBERT and XLM-RoBERTa. The approach proves effective at detecting hate speech and offensive language in three Indo-European languages (English, German, and Hindi) using only 20% annotated data from the HASOC2019 dataset, and it performs strongly in multilingual, zero-shot crosslingual, and monolingual training scenarios. The mBERT-based semisupervised GAN model (SS-GAN-mBERT) outperformed the XLM-RoBERTa-based model (SS-GAN-XLM), with an average F1 score gain of 9.23% and an accuracy gain of 5.75% over the baseline semisupervised mBERT model.
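The abstract does not spell out the training objective, so the following is only a minimal sketch of a commonly used semi-supervised GAN loss, not the authors' implementation. It assumes a discriminator that emits k + 1 logits per example (k real classes plus one extra "fake" class), with labeled examples contributing a supervised term and all examples contributing a real-versus-fake term. The function names, the `None` convention for unlabeled examples, and the toy logits are hypothetical; in a model of the kind described, the logits would come from a classifier head on top of a PLM encoder such as mBERT.

```python
import math

EPS = 1e-8  # small constant to keep logarithms finite


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def mean(values):
    """Average of a list, defined as 0.0 for an empty list."""
    return sum(values) / len(values) if values else 0.0


def discriminator_loss(real_logits, real_labels, fake_logits):
    """Assumed SS-GAN discriminator loss.

    real_logits: k+1 logits per real example (last logit = "fake" class).
    real_labels: class index per real example, or None if unlabeled.
    fake_logits: k+1 logits per generated example.
    """
    supervised, unsup_real = [], []
    for logits, label in zip(real_logits, real_labels):
        # Every real example should NOT be assigned to the "fake" class.
        p_fake = softmax(logits)[-1]
        unsup_real.append(-math.log(1.0 - p_fake + EPS))
        # Only labeled examples contribute cross-entropy over the k real classes.
        if label is not None:
            class_probs = softmax(logits[:-1])
            supervised.append(-math.log(class_probs[label] + EPS))
    # Generated examples should be assigned to the "fake" class.
    unsup_fake = [-math.log(softmax(l)[-1] + EPS) for l in fake_logits]
    return mean(supervised) + mean(unsup_real) + mean(unsup_fake)


def generator_loss(fake_logits):
    """Assumed generator loss: generated examples should look real."""
    return mean([-math.log(1.0 - softmax(l)[-1] + EPS) for l in fake_logits])


# Toy batch with k = 2 classes: one labeled and one unlabeled real example,
# plus one generated example that the discriminator correctly flags as fake.
real = [[5.0, -5.0, -5.0], [-5.0, 5.0, -5.0]]
labels = [0, None]
fake = [[-5.0, -5.0, 5.0]]
d_loss = discriminator_loss(real, labels, fake)
g_loss = generator_loss(fake)
```

In this toy batch the discriminator is confident and correct, so `d_loss` is close to zero while `g_loss` is large. The unlabeled example still contributes through the real-versus-fake term, which is how this kind of objective makes use of data without annotations.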
Pages: 19