Detecting Arabic Offensive Language in Microblogs Using Domain-Specific Word Embeddings and Deep Learning

Cited by: 3

Authors:

Aljuhani, Khulood O. [1]

Alyoubi, Khaled H. [1]

Alotaibi, Fahd S. [1]

Affiliations:

[1] King Abdulaziz Univ, Fac Comp & Informat Technol, Informat Syst Dept, Jeddah, Saudi Arabia

Source:

TEHNICKI GLASNIK-TECHNICAL JOURNAL | 2022 / Vol. 16 / No. 03

Keywords:

Arabic Natural Language Processing; Arabic Tweets; Offensive Language Detection; Offensive Language; Word Embeddings;

DOI:

10.31803/tg-20220305120018

Chinese Library Classification (CLC) number:

T [Industrial Technology];

Subject classification code:

08;

Abstract:

In recent years, social media networks have emerged as key platforms for expressing opinions, communicating, and distributing content. However, users often take advantage of perceived anonymity on these platforms to share offensive or hateful content. Offensive language has therefore become a significant problem as online communication and social media have grown in popularity, and it has drawn considerable attention to methods for detecting offensive content and preventing its spread on online social networks. This paper aims to develop an effective Arabic offensive language detection model using deep learning with semantic and contextual features. It proposes a deep learning approach that combines a bidirectional long short-term memory (BiLSTM) model with domain-specific word embeddings extracted from an Arabic offensive dataset. The approach was evaluated on an Arabic dataset collected from Twitter. The highest accuracy, 0.93, was achieved by the BiLSTM model trained with a combination of domain-specific and domain-agnostic word embeddings.


Pages: 394-400

Number of pages: 7
