Multilingual Hate Speech Detection Using Semi-supervised Generative Adversarial Network

Cited by: 1
Authors
Mnassri, Khouloud [1]
Farahbakhsh, Reza [1]
Crespi, Noel [1]
Affiliations
[1] Inst Polytech Paris, Samovar, Telecom SudParis, F-91120 Palaiseau, France
Keywords
Hate speech; offensive language; semi-supervised; GAN; mBERT; multilingual; social media
DOI
10.1007/978-3-031-53503-1_16
Chinese Library Classification
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Online communication has overcome linguistic and cultural barriers, enabling global connection through social media platforms. However, linguistic variety has introduced further challenges for tasks such as hate speech detection. Although multiple NLP solutions based on advanced machine learning techniques have been proposed, the scarcity of annotated data remains a serious problem, motivating the use of semi-supervised approaches. This paper proposes an innovative solution: a multilingual semi-supervised model based on Generative Adversarial Networks (GAN) and mBERT, namely SS-GAN-mBERT. We detected hate speech in Indo-European languages (English, German, and Hindi) using only 20% labeled data from the HASOC2019 dataset. Our approach excelled in multilingual, zero-shot cross-lingual, and monolingual paradigms, achieving, on average, a 9.23% F1 score boost and a 5.75% accuracy increase over the baseline mBERT model.
Pages: 192-204
Number of pages: 13