Semi-meta-supervised hate speech detection

Cited by: 3
Authors
Putra, Cendra Devayana [1]
Wang, Hei-Chia [1, 2]
Affiliations
[1] Natl Cheng Kung Univ, Inst Informat Management, Tainan 701, Taiwan
[2] Natl Cheng Kung Univ, Ctr Innovat FinTech Business Models, Tainan 701, Taiwan
Keywords
Semisupervised learning; Single-task learning; Hate speech; Shared knowledge
DOI
10.1016/j.knosys.2024.111386
Chinese Library Classification
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
On social media, hate speech is a daily occurrence but has physical and psychological implications. Utilizing a deep learning strategy to combat hate speech is one method for preventing it. Deep learning techniques may require massive datasets to generate accurate models, but hate speech samples (such as misogyny and cyber samples) are frequently insufficient and diverse. We offer methods for leveraging these diverse datasets and enhancing deep learning models through knowledge sharing. We analyzed the existing Bidirectional Encoder Representations from Transformers (BERT) technique and built a BERT-3CNN method to generate a single-task classifier that optimally absorbs the target dataset's features. Second, we proposed a shared BERT layer to gain a general understanding of hate speech. Third, we proposed a method for adapting another dataset to the desired dataset. We conducted several quantitative experimental investigations on five datasets, including Hatebase, Supremacist, Cybertroll, TRAC, and TRAC 2020, and assessed the achieved performance using the accuracy and F1 metrics. The first experiment demonstrated that our BERT-3CNN model improved the average accuracy by 5% and the F1 score by 18%. The second experiment demonstrated that BERT-SP improved the average accuracy by 0.2% and the F1 score by 2%. TRAC, Supremacist, Hatebase, and Cybertroll all showed improvements in accuracy, with Semi BERT-SP enhancing accuracy by 6% and F1 score by 5%, while TRAC 2020 showed 10% and 9% improvements.
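The abstract reports every result as accuracy and F1. As a reminder of how those two numbers relate, here is a minimal sketch in Python. It assumes binary labels and F1 computed on a single positive class; the abstract does not say whether the paper uses binary, macro, or weighted F1, and the function name and the toy labels are hypothetical, not taken from the paper.

```python
def accuracy_and_f1(y_true, y_pred, positive=1):
    """Return (accuracy, F1) for two equal-length label lists.

    F1 is computed for the `positive` class only (an assumption;
    the source does not state the averaging scheme).
    """
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("label lists must be non-empty and of equal length")

    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)  # true positives
    fp = sum(t != positive and p == positive for t, p in pairs)  # false positives
    fn = sum(t == positive and p != positive for t, p in pairs)  # false negatives

    accuracy = sum(t == p for t, p in pairs) / len(pairs)
    # F1 = 2*TP / (2*TP + FP + FN); defined as 0.0 when there are no positives at all.
    denom = 2 * tp + fp + fn
    f1 = 2 * tp / denom if denom else 0.0
    return accuracy, f1


# Made-up labels for illustration: 1 = hateful, 0 = not hateful.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
acc, f1 = accuracy_and_f1(y_true, y_pred)
print(f"accuracy={acc:.2f}, F1={f1:.2f}")
```

Accuracy counts all correct predictions, whereas F1 ignores true negatives, which is why the two can move by different amounts on imbalanced hate speech datasets.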
Pages: 16