The Limits of Abstract Evaluation Metrics: The Case of Hate Speech Detection

Cited by: 14
Authors:
Olteanu, Alexandra [1]
Talamadupula, Kartik [1]
Varshney, Kush R. [1]
Affiliations:
[1] IBM Res, Armonk, NY 10504 USA
Keywords: Evaluation metrics; hate speech; human-centered metrics
DOI:
10.1145/3091478.3098871
CLC classification: TP [Automation Technology, Computer Technology]
Discipline code: 0812
Abstract:
Wagstaff (2012) draws attention to the pervasiveness of abstract evaluation metrics that explicitly ignore or remove problem specifics. While such metrics allow practitioners to compare numbers across application domains, they offer limited insight into the impact of algorithmic decisions on humans and into how users perceive the algorithm's correctness. Even for problems that are mathematically the same, both the real cost of identical errors and their perceived cost to users may vary significantly with the specifics of the problem domain and with the user perceiving the result. While the real cost of errors has been considered previously, little attention has been paid to perceived cost. We advocate for the inclusion of human-centered metrics that elicit error costs from humans along two dimensions: the nature of the error, and the user context. Focusing on hate speech detection on social media, we demonstrate that even when performance as measured by an abstract metric such as precision is held fixed, user perception of correctness varies greatly depending on the nature of the errors and on user characteristics.
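The abstract's central claim — that an abstract metric can hold steady while perceived cost diverges — can be sketched with a toy example. The confusion counts, error categories, and cost weights below are invented for illustration and are not taken from the paper's experiments:

```python
# Toy sketch (hypothetical numbers): two hate-speech classifiers with
# identical precision whose false positives differ in kind, and hence
# in how costly users would perceive them to be.

def precision(tp: int, fp: int) -> float:
    """Standard precision: correct positives over all flagged items."""
    return tp / (tp + fp)

# Both hypothetical systems flag 100 posts and get 80 right, but their
# 20 false positives fall into different (invented) error categories.
sys_a = {"tp": 80, "fp_benign_slang": 20, "fp_quoting_abuse": 0}
sys_b = {"tp": 80, "fp_benign_slang": 0, "fp_quoting_abuse": 20}

p_a = precision(sys_a["tp"], sys_a["fp_benign_slang"] + sys_a["fp_quoting_abuse"])
p_b = precision(sys_b["tp"], sys_b["fp_benign_slang"] + sys_b["fp_quoting_abuse"])
assert p_a == p_b == 0.8  # the abstract metric cannot tell the systems apart

# Hypothetical per-category costs, as might be elicited from users:
# suppressing posts that quote abuse to condemn it is judged far worse
# than misflagging benign slang (weights invented for illustration).
perceived_cost = {"fp_benign_slang": 1.0, "fp_quoting_abuse": 5.0}

cost_a = sum(perceived_cost[k] * v for k, v in sys_a.items() if k in perceived_cost)
cost_b = sum(perceived_cost[k] * v for k, v in sys_b.items() if k in perceived_cost)
print(p_a, p_b, cost_a, cost_b)  # 0.8 0.8 20.0 100.0
```

Under these assumed weights, the two systems are indistinguishable by precision yet differ five-fold in user-perceived cost, which is the gap the human-centered metrics advocated above are meant to expose.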
Pages: 405 - 406 (2 pages)