Explainability and Hate Speech: Structured Explanations Make Social Media Moderators Faster

Cited by: 0
Authors
Calabrese, Agostina [1 ,2 ]
Neves, Leonardo [1 ]
Shah, Neil [2 ]
Bos, Maarten W. [1 ]
Ross, Bjorn [2 ]
Lapata, Mirella [1 ]
Barbieri, Francesco [1 ]
Affiliations
[1] Univ Edinburgh, Sch Informat, Edinburgh, Midlothian, Scotland
[2] Snap Inc, Santa Monica, CA USA
Funding
European Research Council; UK Engineering and Physical Sciences Research Council
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Content moderators play a key role in keeping the conversation on social media healthy. While the high volume of content they need to judge is a bottleneck in the moderation pipeline, no studies have explored how models could help them make faster decisions. There is, by now, a vast body of research into detecting hate speech, sometimes explicitly motivated by a desire to help improve content moderation, but published research using real content moderators is scarce. In this work, we investigate the effect of explanations on the speed of real-world moderators. Our experiments show that while generic explanations do not affect their speed and are often ignored, structured explanations lower moderators' decision-making time by 7.4%.
Pages: 398-408
Number of pages: 11