The Limits of Abstract Evaluation Metrics: The Case of Hate Speech Detection

Cited by: 14
Authors
Olteanu, Alexandra [1]
Talamadupula, Kartik [1]
Varshney, Kush R. [1]
Affiliations
[1] IBM Res, Armonk, NY 10504 USA
Keywords
Evaluation metrics; hate speech; human-centered metrics
DOI
10.1145/3091478.3098871
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Wagstaff (2012) draws attention to the pervasiveness of abstract evaluation metrics that explicitly ignore or remove problem specifics. While such metrics allow practitioners to compare numbers across application domains, they offer limited insight into the impact of algorithmic decisions on humans and into how humans perceive the algorithm's correctness. Even for problems that are mathematically the same, both the real cost of (mathematically) identical errors and their perceived cost to users may vary significantly with the specifics of each problem domain and with the user perceiving the result. While the real cost of errors has been considered previously, little attention has been paid to perceived cost. We advocate for the inclusion of human-centered metrics that elicit error costs from humans from two perspectives: the nature of the error, and the user context. Focusing on hate speech detection on social media, we demonstrate that even when performance as measured by an abstract metric such as precision is held fixed, user perception of correctness varies greatly depending on the nature of the errors and on user characteristics.
Pages: 405-406
Page count: 2
Related papers
50 in total
  • [41] A comparison of classification algorithms for hate speech detection
    Putri, T. T. A.
    Sriadhi, S.
    Sari, R. D.
    Rahmadani, R.
    Hutahaean, H. D.
    INTERNATIONAL CONFERENCE ON INNOVATION IN ENGINEERING AND VOCATIONAL EDUCATION 2019 (ICIEVE 2019), PTS 1-4, 2020, 830
  • [42] A survey of hate speech detection in Indian languages
    Nandi, Arpan
    Sarkar, Kamal
    Mallick, Arjun
    De, Arkadeep
    SOCIAL NETWORK ANALYSIS AND MINING, 2024, 14 (01)
  • [43] A Turkish Hate Speech Dataset and Detection System
    Beyhan, Fatih
    Carik, Buse
    Arin, Inanc
    Terzioglu, Aysecan
    Yanikoglu, Berrin
    Yeniterzi, Reyyan
    LREC 2022: THIRTEENTH INTERNATIONAL CONFERENCE ON LANGUAGE RESOURCES AND EVALUATION, 2022, : 4177 - 4185
  • [44] Impact of Data Augmentation on Hate Speech Detection
    Batarfi, Hanan A.
    Alsaedi, Olaa A.
    Wali, Arwa M.
    Jamal, Amani T.
    INNOVATIONS FOR COMMUNITY SERVICES, I4CS 2023, 2023, 1876 : 187 - 199
  • [45] Multiclass hate speech detection with an aggregated dataset
    Walsh, Sinead
    Greaney, Paul
    NATURAL LANGUAGE PROCESSING, 2025,
  • [46] Spanish hate-speech detection in football
    Montesinos-Canovas, Esteban
    Garcia-Sanchez, Francisco
    Antonio Garcia-Diaz, Jose
    Alcaraz-Marmol, Gema
    Valencia-Garcia, Rafael
    PROCESAMIENTO DEL LENGUAJE NATURAL, 2023, (71) : 15 - 27
  • [47] The effect of gender bias on hate speech detection
    Sahinuc, Furkan
    Yilmaz, Eyup Halit
    Toraman, Cagri
    Koc, Aykut
    SIGNAL IMAGE AND VIDEO PROCESSING, 2023, 17 (04) : 1591 - 1597
  • [48] A Survey on Automatic Detection of Hate Speech in Text
    Fortuna, Paula
    Nunes, Sergio
    ACM COMPUTING SURVEYS, 2018, 51 (04)
  • [49] Deep Learning Ensembles for Hate Speech Detection
    Alsafari, Safa
    Sadaoui, Samira
    Mouhoub, Malek
    2020 IEEE 32ND INTERNATIONAL CONFERENCE ON TOOLS WITH ARTIFICIAL INTELLIGENCE (ICTAI), 2020, : 526 - 531
  • [50] Enhancing hate speech detection with user characteristics
    Raut, Rohan
    Spezzano, Francesca
    INTERNATIONAL JOURNAL OF DATA SCIENCE AND ANALYTICS, 2024, 18 (04) : 445 - 455