Data Augmentation for Improving Explainability of Hate Speech Detection

Cited by: 0
Authors
Ansari, Gunjan [1]
Kaur, Parmeet [2]
Saxena, Chandni [3]
Affiliations
[1] JSS Acad Tech Educ, Dept Informat Technol, Noida, India
[2] Jaypee Inst Informat Technol, Dept Comp Sci & Informat Technol, Noida, India
[3] Chinese Univ Hong Kong, SAR, Hong Kong, Peoples R China
Keywords
Hate speech; Cyberbullying; Explainable AI; Data augmentation; LIME; Integrated gradient
DOI
10.1007/s13369-023-08100-4
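Chinese Library Classification (CLC) number
O [Mathematical and physical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences]
Discipline classification code
07; 0710; 09
Abstract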
This paper presents a novel data augmentation-based approach for developing explainable deep learning models for hate speech detection. Hate speech is prevalent on online social media but is difficult to detect automatically because of the challenges of natural language processing and the complexity of hate speech itself. Furthermore, the decisions of existing solutions have limited explainability because little annotated data is available for training and testing models. This work therefore proposes text-based data augmentation to improve both the performance and the explainability of deep learning models. Three families of augmentation techniques are used: easy data augmentation (EDA), bidirectional encoder representations from transformers (BERT), and back translation. Convolutional neural network (CNN) and long short-term memory (LSTM) models are trained on the augmented data and evaluated on two publicly available hate speech detection datasets. LIME and integrated gradients are used to extract explanations from the deep learning models. A diagnostic study on test samples checks whether the models improve as a result of data augmentation. The experimental results show that the proposed approach improves both the explainability and the accuracy of hate speech detection.
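As an illustration of the easy data augmentation family named in the abstract, the following is a minimal sketch of two of its standard operations, random swap and random deletion. It is not the authors' implementation: the function names, the default probabilities, and the alternating schedule in `augment` are assumptions made for this example, and the synonym-based EDA operations are omitted because they need an external thesaurus.

```python
import random


def random_swap(words, n_swaps, rng):
    """Swap two randomly chosen positions, repeated n_swaps times."""
    out = list(words)
    if len(out) < 2:
        return out
    for _ in range(n_swaps):
        i, j = rng.sample(range(len(out)), 2)
        out[i], out[j] = out[j], out[i]
    return out


def random_deletion(words, p, rng):
    """Drop each word independently with probability p.

    A single-word input is returned unchanged, and if every word
    is dropped one randomly chosen word is kept instead.
    """
    if len(words) <= 1:
        return list(words)
    kept = [w for w in words if rng.random() >= p]
    return kept if kept else [rng.choice(words)]


def augment(sentence, n_aug=4, p_delete=0.1, n_swaps=1, seed=0):
    """Return n_aug variants, alternating swap and deletion (hypothetical schedule)."""
    rng = random.Random(seed)
    words = sentence.split()
    variants = []
    for k in range(n_aug):
        if k % 2 == 0:
            new_words = random_swap(words, n_swaps, rng)
        else:
            new_words = random_deletion(words, p_delete, rng)
        variants.append(" ".join(new_words))
    return variants


if __name__ == "__main__":
    for variant in augment("this is a harmless example sentence for augmentation"):
        print(variant)
```

Each variant would keep the label of its source sentence when added to the training set; whether that assumption holds for a given hate speech corpus has to be checked, since deleting or moving a single word can change the meaning of a short post.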
Pages: 3609-3621
Page count: 13
Related papers
50 in total
  • [21] Explainability and Hate Speech: Structured Explanations Make Social Media Moderators Faster
    Calabrese, Agostina
    Neves, Leonardo
    Shah, Neil
    Bos, Maarten W.
    Ross, Bjorn
    Lapata, Mirella
    Barbieri, Francesco
    PROCEEDINGS OF THE 62ND ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, VOL 2: SHORT PAPERS, 2024, : 398 - 408
  • [22] Dynamically Refined Regularization for Improving Cross-corpora Hate Speech Detection
    Bose, Tulika
    Aletras, Nikolaos
    Illina, Irina
    Fohr, Dominique
    FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2022), 2022, : 372 - 382
  • [23] Improving Turkish Telephone Speech Recognition with Data Augmentation and Out of Domain Data
    Uslu, Zeynep Gulhan
    Yildirim, Tulay
    2019 16TH INTERNATIONAL MULTI-CONFERENCE ON SYSTEMS, SIGNALS & DEVICES (SSD), 2019, : 176 - 179
  • [24] Performance comparison of data balancing techniques on hate speech detection in Turkish
    Karayigit, Habibe
    Akdagli, Ali
    Aci, Cigdem
    PAMUKKALE UNIVERSITY JOURNAL OF ENGINEERING SCIENCES-PAMUKKALE UNIVERSITESI MUHENDISLIK BILIMLERI DERGISI, 2024, 30 (05): : 610 - 621
  • [25] A comprehensive review on detection of hate speech for multi-lingual data
    Narula, Rachna
    Chaudhary, Poonam
    SOCIAL NETWORK ANALYSIS AND MINING, 2025, 14 (01)
  • [26] Multilingual Hate Speech Detection: Innovations in Optimized Deep Learning for English and Arabic Hate Speech Detection
    Hassan AL-Sukhani
    Qusay Bsoul
    Abdelrahman H. Elhawary
    Ziad M. Nasr
    Ahmed E. Mansour
    Radwan M. Batyha
    Basma S. Alqadi
    Jehad Saad Alqurni
    Hayat Alfagham
    Magda M. Madbouly
    SN Computer Science, 6 (3)
  • [27] On Online Hate Speech Detection. Effects of Negated Data Construction
    Abderrouaf, Cheniki
    Oussalah, Mourad
    2019 IEEE INTERNATIONAL CONFERENCE ON BIG DATA (BIG DATA), 2019, : 5595 - 5602
  • [28] Data expansion using back translation and paraphrasing for hate speech detection
    Beddiar D.R.
    Jahan M.S.
    Oussalah M.
    Online Social Networks and Media, 2021, 24
  • [29] Bias Detection and Mitigation in Textual Data: A Study on Fake News and Hate Speech Detection
    Kasampalis, Apostolos
    Chatzakou, Despoina
    Tsikrika, Theodora
    Vrochidis, Stefanos
    Kompatsiaris, Ioannis
    ADVANCES IN INFORMATION RETRIEVAL, ECIR 2024, PT III, 2024, 14610 : 374 - 383
  • [30] Improving Intrusion Detection Through Training Data Augmentation
    Otokwala, Uneneibotejit
    Petrovski, Andrei
    Kalutarage, Harsha
    2021 14TH INTERNATIONAL CONFERENCE ON SECURITY OF INFORMATION AND NETWORKS (SIN 2021), 2021,