Quantification of Automatic Speech Recognition System Performance on d/Deaf and Hard of Hearing Speech

Authors
Zhao, Robin [1 ]
Choi, Anna S. G. [2 ]
Koenecke, Allison [2 ]
Rameau, Anais [1 ]
Affiliations
[1] Weill Cornell Med Coll, Sean Parker Inst Voice, New York, NY USA
[2] Cornell Univ, Dept Informat Sci, Ithaca, NY USA
Source
LARYNGOSCOPE | 2025 / Vol. 135 / Issue 01
Keywords
artificial intelligence; voice; DEAF SPEECH; INTELLIGIBILITY; CHILDREN; PERCEPTION; SKILLS;
DOI
10.1002/lary.31713
Chinese Library Classification number
R-3 [Medical research methods]; R3 [Basic medicine];
Subject classification code
1001 ;
Abstract
Objective: To evaluate the performance of commercial automatic speech recognition (ASR) systems on d/Deaf and hard-of-hearing (d/Dhh) speech.
Methods: A corpus containing 850 audio files of d/Dhh and normal hearing (NH) speech from the University of Memphis Speech Perception Assessment Laboratory was tested on four speech-to-text application programming interfaces (APIs): Amazon Web Services, Microsoft Azure, Google Chirp, and OpenAI Whisper. We quantified the Word Error Rate (WER) of API transcriptions for 24 d/Dhh and nine NH participants and performed subgroup analysis by speech intelligibility classification (SIC), hearing loss (HL) onset, and primary communication mode.
Results: Mean WER averaged across APIs was 10 times higher for the d/Dhh group (52.6%) than the NH group (5.0%). APIs performed significantly worse for the "low" and "medium" SIC groups (85.9% and 46.6% WER, respectively) than for the "high" SIC group (9.5% WER, comparable to the NH group). APIs performed significantly worse for speakers with prelingual HL than for those with postlingual HL (80.5% and 37.1% WER, respectively). APIs performed significantly worse for speakers primarily communicating with sign language (70.2% WER) than for speakers using both oral and sign language communication (51.5%) or oral communication only (19.7%).
Conclusion: Commercial ASR systems underperform for d/Dhh individuals, especially those with "low" and "medium" SIC, prelingual onset of HL, and sign language as primary communication mode. This contrasts with Big Tech companies' promises of accessibility, indicating the need for ASR systems ethically trained on heterogeneous d/Dhh speech data.
Level of Evidence: 3. Laryngoscope, 2024.
Commercial automatic speech recognition (ASR) systems underperform for d/Deaf and hard-of-hearing (d/Dhh) individuals, especially those with "low" and "medium" speech intelligibility classification, prelingual onset of hearing loss, and sign language as primary communication mode. There is a need for ASR systems ethically trained on heterogeneous d/Dhh speech data.
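For readers unfamiliar with the metric, WER is conventionally defined as the word-level edit distance (substitutions, deletions, and insertions) between a reference transcript and the ASR output, divided by the number of words in the reference. The sketch below illustrates that standard definition only. The abstract does not describe the study's text normalization or scoring tooling, so the lowercasing, whitespace tokenization, function name, and example sentences here are all assumptions made for illustration.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the reference length.

    Assumption: text is lowercased and split on whitespace; the study's
    actual normalization is not stated in the abstract.
    """
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from output
            insertion = curr[j - 1] + 1  # extra word in output
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical sentences, not taken from the study corpus.
wer = word_error_rate("the cat sat on the mat", "the bat sat on mat")
print(f"WER = {wer:.1%}")
```

Because insertions count as errors, WER can exceed 100% when the output contains many more words than the reference.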
Pages: 191-197
Page count: 7