Hate and offensive speech detection on Arabic social media

Cited by: 41
Authors
Alsafari S. [1, 2]
Sadaoui S. [1]
Mouhoub M. [1]
Affiliations
[1] University of Regina, Regina
[2] University of Jeddah, Jeddah
Keywords
Arabic corpus; Data annotation; Data extraction; Deep learning; Feature extraction; Hate speech; Multi-class classification; Social media
DOI
10.1016/j.osnem.2020.100096
Abstract
Hate speech that targets individuals for their protected characteristics is proliferating on social media. Our study aims to devise an effective framework for detecting Arabic hate and offensive speech. First, we build a reliable Arabic textual corpus by crawling Twitter with four robust extraction strategies, one for each of four types of hate: religion, ethnicity, nationality, and gender. Next, we label the corpus with a three-level hierarchical annotation scheme and verify inter-annotator agreement at each level to ensure a reliable ground truth. Using machine learning and deep learning techniques, we develop numerous two-class, three-class, and six-class classification models, which we combine with a variety of feature extraction techniques, such as contextual word embeddings. Finally, we conduct intensive experiments to assess the performance of the learned models and to examine their misclassification errors. The results are very encouraging compared with prior hate and offensive speech studies on Arabic and other languages. © 2020 Elsevier B.V.
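The hierarchical labeling and agreement check described in the abstract can be illustrated with a minimal sketch. It is not the authors' code. It assumes the six classes are clean, offensive, and the four hate types, that the three-class level merges the hate types, and that the two-class level separates clean from everything else. The label strings, the `collapse` helper, and the choice of Cohen's kappa as the agreement statistic are all illustrative assumptions; the abstract does not name them.

```python
from collections import Counter

# Hypothetical label strings: the abstract names the four hate types
# but not the exact labels used in the corpus.
SIX_TO_THREE = {
    "clean": "clean",
    "offensive": "offensive",
    "hate_religion": "hate",
    "hate_ethnicity": "hate",
    "hate_nationality": "hate",
    "hate_gender": "hate",
}
# Assumed two-class level: clean versus anything offensive or hateful.
THREE_TO_TWO = {"clean": "clean", "offensive": "not_clean", "hate": "not_clean"}


def collapse(labels, mapping):
    """Map fine-grained labels onto a coarser level of the hierarchy."""
    return [mapping[label] for label in labels]


def cohen_kappa(a, b):
    """Cohen's kappa for two annotators who labeled the same items."""
    if len(a) != len(b) or not a:
        raise ValueError("annotations must be non-empty and of equal length")
    n = len(a)
    # Fraction of items on which the two annotators agree.
    observed = sum(x == y for x, y in zip(a, b)) / n
    # Agreement expected by chance from each annotator's label frequencies.
    count_a, count_b = Counter(a), Counter(b)
    expected = sum(count_a[k] * count_b[k] for k in count_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both annotators used one identical label throughout
    return (observed - expected) / (1.0 - expected)


# Toy annotations (invented for illustration, not taken from the corpus).
annotator_a = ["clean", "offensive", "hate_religion", "hate_gender", "clean", "offensive"]
annotator_b = ["clean", "offensive", "hate_religion", "hate_ethnicity", "clean", "clean"]

kappa_six = cohen_kappa(annotator_a, annotator_b)
three_a = collapse(annotator_a, SIX_TO_THREE)
three_b = collapse(annotator_b, SIX_TO_THREE)
kappa_three = cohen_kappa(three_a, three_b)
kappa_two = cohen_kappa(collapse(three_a, THREE_TO_TWO), collapse(three_b, THREE_TO_TWO))
```

Computing agreement separately at each level shows where annotators diverge: a disagreement between two hate types lowers the six-class score but disappears once the labels are collapsed to three classes.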