Explainability and Hate Speech: Structured Explanations Make Social Media Moderators Faster

Cited by: 0

Authors:

Calabrese, Agostina [1,2]

Neves, Leonardo [1]

Shah, Neil [2]

Bos, Maarten W. [1]

Ross, Bjorn [2]

Lapata, Mirella [1]

Barbieri, Francesco [1]

Affiliations:

[1] Univ Edinburgh, Sch Informat, Edinburgh, Midlothian, Scotland

[2] Snap Inc, Santa Monica, CA USA

Source:

PROCEEDINGS OF THE 62ND ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, VOL 2: SHORT PAPERS | 2024

Funding:

European Research Council; UK Engineering and Physical Sciences Research Council

Keywords:

DOI:

Not available

Chinese Library Classification number:

TP18 [Artificial intelligence theory]

Discipline classification codes:

081104; 0812; 0835; 1405

Abstract:

Content moderators play a key role in keeping the conversation on social media healthy. While the high volume of content they need to judge represents a bottleneck to the moderation pipeline, no studies have explored how models could support them to make faster decisions. There is, by now, a vast body of research into detecting hate speech, sometimes explicitly motivated by a desire to help improve content moderation, but published research using real content moderators is scarce. In this work we investigate the effect of explanations on the speed of real-world moderators. Our experiments show that while generic explanations do not affect their speed and are often ignored, structured explanations lower moderators' decision-making time by 7.4%.


Pages: 398-408

Number of pages: 11
