Quantification of Automatic Speech Recognition System Performance on d/Deaf and Hard of Hearing Speech

Cited by: 0
Authors
Zhao, Robin [1]
Choi, Anna S. G. [2]
Koenecke, Allison [2]
Rameau, Anais [1]
Affiliations
[1] Weill Cornell Med Coll, Sean Parker Inst Voice, New York, NY USA
[2] Cornell Univ, Dept Informat Sci, Ithaca, NY USA
Source
LARYNGOSCOPE | 2025 / Vol. 135 / Issue 01
Keywords
artificial intelligence; voice; DEAF SPEECH; INTELLIGIBILITY; CHILDREN; PERCEPTION; SKILLS;
DOI
10.1002/lary.31713
Chinese Library Classification number
R-3 [Medical research methods]; R3 [Basic medicine];
Discipline classification code
1001;
Abstract
Objective: To evaluate the performance of commercial automatic speech recognition (ASR) systems on d/Deaf and hard-of-hearing (d/Dhh) speech.
Methods: A corpus containing 850 audio files of d/Dhh and normal hearing (NH) speech from the University of Memphis Speech Perception Assessment Laboratory was tested on four speech-to-text application program interfaces (APIs): Amazon Web Services, Microsoft Azure, Google Chirp, and OpenAI Whisper. We quantified the Word Error Rate (WER) of API transcriptions for 24 d/Dhh and nine NH participants and performed subgroup analysis by speech intelligibility classification (SIC), hearing loss (HL) onset, and primary communication mode.
Results: Mean WER averaged across APIs was 10 times higher for the d/Dhh group (52.6%) than the NH group (5.0%). APIs performed significantly worse for "low" and "medium" SIC (85.9% and 46.6% WER, respectively) as compared to "high" SIC group (9.5% WER, comparable to NH group). APIs performed significantly worse for speakers with prelingual HL relative to postlingual HL (80.5% and 37.1% WER, respectively). APIs performed significantly worse for speakers primarily communicating with sign language (70.2% WER) relative to speakers with both oral and sign language communication (51.5%) or oral communication only (19.7%).
Conclusion: Commercial ASR systems underperform for d/Dhh individuals, especially those with "low" and "medium" SIC, prelingual onset of HL, and sign language as primary communication mode. This contrasts with Big Tech companies' promises of accessibility, indicating the need for ASR systems ethically trained on heterogeneous d/Dhh speech data.
Level of Evidence: 3. Laryngoscope, 2024.
Commercial automatic speech recognition (ASR) systems underperform for d/Deaf and hard-of-hearing (d/Dhh) individuals, especially those with "low" and "medium" speech intelligibility classification, prelingual onset of hearing loss, and sign language as primary communication mode. There is a need for ASR systems ethically trained on heterogeneous d/Dhh speech data.
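For readers unfamiliar with the metric, Word Error Rate is conventionally defined as the number of word-level substitutions, deletions, and insertions needed to turn the reference transcript into the ASR output, divided by the number of words in the reference. The sketch below illustrates that standard definition only. The abstract does not describe the study's text normalization or tooling, so the lowercasing, the whitespace tokenization, and the function name `word_error_rate` are assumptions made for illustration.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by reference length.

    Assumption: text is lowercased and split on whitespace; the study's
    actual normalization is not given in the abstract.
    """
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,                      # deletion
                curr[j - 1] + 1,                  # insertion
                prev[j - 1] + substitution_cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # One substitution and one deletion against a four-word reference.
    print(word_error_rate("a b c d", "a x c"))
```

Because insertions count as errors, WER computed this way can exceed 100% when the hypothesis is much longer than the reference.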
Pages: 191-197
Number of pages: 7