Towards Comprehensive Subgroup Performance Analysis in Speech Models

Cited by: 5
Authors
Koudounas, Alkis [1]
Pastor, Eliana [1]
Attanasio, Giuseppe [2, 3]
Mazzia, Vittorio [4]
Giollo, Manuel [4]
Gueudre, Thomas [4]
Reale, Elisa [4]
Cagliero, Luca [1]
Cumani, Sandro [1]
de Alfaro, Luca [5]
Baralis, Elena [1]
Amberti, Daniele [4]
Affiliations
[1] Politecn Torino, I-10129 Turin, Italy
[2] Bocconi Inst Data Sci & Anal, MilaNLP Grp, I-20100 Milan, Italy
[3] Bocconi Inst Data Sci & Anal, Data & Mkt Insights Unit, I-20100 Milan, Italy
[4] Amazon, AGI, I-26015 Turin, Italy
[5] Univ Calif Santa Cruz, Santa Cruz, CA 95064 USA
Keywords
Speech representation; E2E-SLU models; subgroup identification; model bias analysis; divergence; SPEAKER VERIFICATION;
DOI
10.1109/TASLP.2024.3363447
Chinese Library Classification (CLC) number
O42 [Acoustics];
Discipline classification code
070206 ; 082403 ;
Abstract
The evaluation of spoken language understanding (SLU) systems is often restricted to assessing their global performance or examining predefined subgroups of interest. However, a more detailed analysis at the subgroup level has the potential to uncover valuable insights into how speech system performance differs across various subgroups. In this work, we identify biased data subgroups and describe them at the level of user demographics, recording conditions, and speech targets. We propose a new task-, model-, and dataset-agnostic approach to detect significant intra- and cross-model performance gaps. We detect problematic data subgroups in SLU models by leveraging the notion of subgroup divergence. We also compare the outcome of different SLU models on the same dataset and task at the subgroup level. We identify significant gaps in subgroup performance between models that differ in size, architecture, or pre-training objectives, including multilingual and monolingual models, yet are comparable to each other in overall performance. The results, obtained on two SLU models, four datasets, and three different tasks (intent classification, automatic speech recognition, and emotion recognition), confirm the effectiveness of the proposed approach in providing a nuanced SLU model assessment.
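The abstract names subgroup divergence without defining it. A common reading, assumed here, is the difference between a metric computed on a subgroup and the same metric computed on the whole dataset. The sketch below illustrates that reading with accuracy as the metric; the function name `subgroup_divergence`, the `min_support` and `max_len` parameters, and the toy metadata are hypothetical and are not taken from the paper.

```python
from itertools import combinations


def subgroup_divergence(rows, correct, min_support=0.2, max_len=2):
    """Map each frequent subgroup to (support, accuracy divergence).

    rows    : list of dicts with categorical metadata per utterance
    correct : list of 0/1 flags, 1 if the model output was correct
    Assumption: divergence = subgroup accuracy - overall accuracy.
    """
    n = len(rows)
    overall = sum(correct) / n
    attrs = sorted(rows[0].keys())
    results = {}
    for k in range(1, max_len + 1):
        for combo in combinations(attrs, k):
            # Count hits and totals for every value combination of these attributes.
            groups = {}
            for row, ok in zip(rows, correct):
                key = tuple((a, row[a]) for a in combo)
                hits, total = groups.get(key, (0, 0))
                groups[key] = (hits + ok, total + 1)
            for key, (hits, total) in groups.items():
                support = total / n
                if support >= min_support:  # skip subgroups that are too small
                    results[key] = (support, hits / total - overall)
    return results


# Toy data (made up for illustration only).
rows = [
    {"gender": "f", "noise": "low"},
    {"gender": "f", "noise": "low"},
    {"gender": "f", "noise": "high"},
    {"gender": "f", "noise": "high"},
    {"gender": "m", "noise": "low"},
    {"gender": "m", "noise": "low"},
    {"gender": "m", "noise": "high"},
    {"gender": "m", "noise": "high"},
]
correct = [1, 1, 0, 0, 1, 1, 1, 1]

divergences = subgroup_divergence(rows, correct)
# The most negatively divergent subgroup is the most problematic one.
worst = min(divergences, key=lambda key: divergences[key][1])
print(worst, divergences[worst])
```

On this toy data the overall accuracy hides the fact that one attribute combination fails entirely, which is the kind of gap a subgroup-level analysis is meant to surface. Exhaustive enumeration is used only for brevity; it would not scale to many attributes.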
Pages: 1468-1480
Page count: 13
Related papers
50 in total
  • [1] Subgroup mining for performance analysis of regression models
    Pimentel, Joao
    Azevedo, Paulo J.
    Torgo, Luis
    EXPERT SYSTEMS, 2023, 40 (01)
  • [2] Towards a comprehensive assessment of speech intelligibility for pathological speech
    Xue, W.
    Ramos, V. Mendoza
    Harmsen, W.
    Cucchiarini, C.
    van Hout, R. W. N. M.
    Strik, H.
    INTERSPEECH 2020, 2020, : 3146 - 3150
  • [3] Towards Application of Speech Analysis in Predicting Learners' Performance
    Chowdary Attota, Dinesh
    Dehbozorgi, Nasrin
    Proceedings - Frontiers in Education Conference, FIE, 2022, 2022-October
  • [4] Towards Application of Speech Analysis in Predicting Learners' Performance
    Attota, Dinesh Chowdary
    Dehbozorgi, Nasrin
    2022 IEEE FRONTIERS IN EDUCATION CONFERENCE, FIE, 2022,
  • [5] Progress towards speech models that model speech
    Russell, M
    1997 IEEE WORKSHOP ON AUTOMATIC SPEECH RECOGNITION AND UNDERSTANDING, PROCEEDINGS, 1997, : 115 - 123
  • [6] Houston we have a Divergence: A Subgroup Performance Analysis of ASR Models
    Koudounas, Alkis
    Giobergia, Flavio
    2024 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING WORKSHOPS, ICASSPW 2024, 2024, : 812 - 813
  • [7] A comprehensive analysis and performance evaluation for osteoporosis prediction models
    Alden, Zahraa Noor Aldeen M. Shams
    Ata, Oguz
    PEERJ COMPUTER SCIENCE, 2024, 10 : 1 - 28
  • [8] Bayesian Hierarchical Models for Subgroup Analysis
    Wang, Yun
    Tu, Wenda
    Koh, William
    Travis, James
    Abugov, Robert
    Hamilton, Kiya
    Zheng, Mengjie
    Crackel, Roberto
    Bonangelino, Pablo
    Rothmann, Mark
    PHARMACEUTICAL STATISTICS, 2024, 23 (06) : 1065 - 1083
  • [9] Bayesian Subgroup Analysis with Hierarchical Models
    Pennello, Gene
    Rothmann, Mark
    BIOPHARMACEUTICAL APPLIED STATISTICS SYMPOSIUM: BIOSTATISTICAL ANALYSIS OF CLINICAL TRIALS, VOL 2, 2018, : 175 - 192
  • [10] Towards better performance for Speech Enhancement
    Mergu, Rohini R.
    Dixit, Shantanu K.
    2015 INTERNATIONAL CONFERENCE ON ELECTRICAL, ELECTRONICS, SIGNALS, COMMUNICATION AND OPTIMIZATION (EESCO), 2015,