Semi-meta-supervised hate speech detection

Cited by: 3

Authors:

Putra, Cendra Devayana [1]

Wang, Hei-Chia [1,2]

Affiliations:

[1] Natl Cheng Kung Univ, Inst Informat Management, Tainan 701, Taiwan

[2] Natl Cheng Kung Univ, Ctr Innovat FinTech Business Models, Tainan 701, Taiwan

Source:

KNOWLEDGE-BASED SYSTEMS | 2024 / Volume 287

Keywords:

Semisupervised learning; Single-task learning; Hate speech; Shared knowledge

DOI:

10.1016/j.knosys.2024.111386

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory]

Discipline classification codes:

081104; 0812; 0835; 1405

Abstract:

On social media, hate speech is a daily occurrence but has physical and psychological implications. Utilizing a deep learning strategy to combat hate speech is one method for preventing it. Deep learning techniques may require massive datasets to generate accurate models, but hate speech samples (such as misogyny and cyber samples) are frequently insufficient and diverse. We offer methods for leveraging these diverse datasets and enhancing deep learning models through knowledge sharing. We analyzed the existing Bidirectional Encoder Representations from Transformers (BERT) technique and built a BERT-3CNN method to generate a single-task classifier that optimally absorbs the target dataset's features. Second, we proposed a shared BERT layer to gain a general understanding of hate speech. Third, we proposed a method for adapting another dataset to the desired dataset. We conducted several quantitative experimental investigations on five datasets, including Hatebase, Supremacist, Cybertroll, TRAC, and TRAC 2020, and assessed the achieved performance using the accuracy and F1 metrics. The first experiment demonstrated that our BERT-3CNN model improved the average accuracy by 5% and the F1 score by 18%. The second experiment demonstrated that BERT-SP improved the average accuracy by 0.2% and the F1 score by 2%. TRAC, Supremacist, Hatebase, and Cybertroll all showed improvements in accuracy, with Semi BERT-SP enhancing accuracy by 6% and F1 score by 5%, while TRAC 2020 showed 10% and 9% improvements.
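The abstract reports gains in both accuracy and F1, and the two can diverge sharply when the hateful class is rare. The following is a minimal sketch of how the two metrics are computed for a binary task. The function name `accuracy_and_f1` and the toy labels are hypothetical illustrations, not data or code from the paper, and the abstract does not state which F1 averaging the authors used; binary F1 on the positive class is assumed here.

```python
def accuracy_and_f1(y_true, y_pred, positive=1):
    """Return (accuracy, F1) for two equal-length label lists.

    F1 is computed for the `positive` class only (binary F1),
    which is an assumption made for this illustration.
    """
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)  # true positives
    fp = sum(t != positive and p == positive for t, p in pairs)  # false positives
    fn = sum(t == positive and p != positive for t, p in pairs)  # false negatives
    accuracy = sum(t == p for t, p in pairs) / len(pairs)
    denom = 2 * tp + fp + fn
    # F1 = 2*TP / (2*TP + FP + FN); defined as 0.0 when there are no positives at all.
    f1 = 2 * tp / denom if denom else 0.0
    return accuracy, f1


# Hypothetical toy labels (1 = hateful, 0 = not hateful).
y_true = [1, 0, 0, 0, 0, 0, 0, 0, 1, 0]
y_pred = [1, 0, 0, 0, 0, 0, 0, 0, 0, 0]
acc, f1 = accuracy_and_f1(y_true, y_pred)
print(f"accuracy={acc:.2f}, F1={f1:.2f}")  # accuracy=0.90, F1=0.67
```

In this toy case one missed hateful sample costs 10 points of accuracy but about 33 points of F1, which is one reason the two metrics are usually reported together on imbalanced hate speech datasets.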


Pages: 16

