Detecting Arabic Offensive Language in Microblogs Using Domain-Specific Word Embeddings and Deep Learning

Cited by: 3
Authors
Aljuhani, Khulood O. [1]
Alyoubi, Khaled H. [1]
Alotaibi, Fahd S. [1]
Affiliation
[1] King Abdulaziz Univ, Fac Comp & Informat Technol, Informat Syst Dept, Jeddah, Saudi Arabia
Source
TEHNICKI GLASNIK-TECHNICAL JOURNAL | 2022 / Volume 16 / Issue 03
Keywords
Arabic Natural Language Processing; Arabic Tweets; Offensive Language Detection; Offensive Language; Word Embeddings
DOI
10.31803/tg-20220305120018
Chinese Library Classification number
T [Industrial Technology]
Discipline classification code
08
Abstract
In recent years, social media networks have emerged as key players by providing platforms for expressing opinions, communicating, and distributing content. However, users often take advantage of perceived anonymity on social media platforms to share offensive or hateful content. As a result, offensive language has become a significant issue as online communication and the popularity of social media platforms have grown. This problem has attracted considerable attention to methods for detecting offensive content and preventing its spread on online social networks. This paper therefore aims to develop an effective Arabic offensive language detection model by employing deep learning together with semantic and contextual features. It proposes a deep learning approach that uses a bidirectional long short-term memory (BiLSTM) model and domain-specific word embeddings extracted from an Arabic offensive dataset. The detection approach was evaluated on an Arabic dataset collected from Twitter. The best result, an accuracy of 0.93, was obtained with the BiLSTM model trained on a combination of domain-specific and domain-agnostic word embeddings.
Pages: 394-400
Page count: 7