Explainability and Hate Speech: Structured Explanations Make Social Media Moderators Faster

Cited by: 0
Authors
Calabrese, Agostina [1 ,2 ]
Neves, Leonardo [1 ]
Shah, Neil [2 ]
Bos, Maarten W. [1 ]
Ross, Bjorn [2 ]
Lapata, Mirella [1 ]
Barbieri, Francesco [1 ]
Affiliations
[1] Univ Edinburgh, Sch Informat, Edinburgh, Midlothian, Scotland
[2] Snap Inc, Santa Monica, CA USA
Funding
European Research Council; UK Engineering and Physical Sciences Research Council (EPSRC)
Keywords
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Content moderators play a key role in keeping the conversation on social media healthy. While the high volume of content they need to judge represents a bottleneck to the moderation pipeline, no studies have explored how models could support them to make faster decisions. There is, by now, a vast body of research into detecting hate speech, sometimes explicitly motivated by a desire to help improve content moderation, but published research using real content moderators is scarce. In this work we investigate the effect of explanations on the speed of real-world moderators. Our experiments show that while generic explanations do not affect their speed and are often ignored, structured explanations lower moderators' decision making time by 7.4%.
Pages: 398-408
Page count: 11