Two Protocols Comparing Human and Machine Phonetic Recognition Performance in Conversational Speech

Authors
Shen, Wade [1]
Olive, Joseph [2]
Jones, Douglas [1]
Affiliations
[1] MIT, Lincoln Lab, 244 Wood St, Lexington, MA 02420 USA
[2] Def Adv Res Projects Agcy, Arlington, VA 22203 USA
Keywords
phonetic discrimination; speech perception; phonetic recognition
DOI
Not available
Chinese Library Classification
TP18 [Artificial Intelligence Theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
This paper describes two experimental protocols for direct comparison of human and machine phonetic discrimination performance in continuous speech. These protocols attempt to isolate phonetic discrimination while eliminating language and segmentation biases. Results of two human experiments are described, including comparisons with automatic phonetic recognition baselines. Our experiments suggest that in conversational telephone speech, human performance on these tasks exceeds that of machines by 15%. Furthermore, in a related controlled language model experiment, human subjects predicted words in conversational speech more accurately, by 45%.
Pages: 1630 / +
Page count: 2