Diagnostics of speech recognition using classification phoneme diagnostic trees

Cited by: 0
Authors
Cernak, Milos [1]
Wellekens, Christian [1]
Affiliations
[1] Inst Eurecom, Dept Multimedia Commun, 2229 Route Cretes, BP 193, F-06904 Sophia Antipolis, France
Keywords
fault diagnosis; speech recognition; intrinsic speech variabilities
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
More than three decades of speech recognition research have resulted in a very sophisticated statistical framework. However, comparatively little attention has been devoted to the diagnostics of speech recognition; most previous research reports results in terms of ever-lower word error rates (WER) under various intrinsic or environmental conditions. This paper presents diagnostics of the decoding process of automatic speech recognition (ASR) systems. The purpose of our diagnostics is to go beyond standard evaluation in terms of WER and confusion matrices and to look at the recognized output in more detail. During the decoding phase, specific data are collected at the decoder as possible causes of errors and are later statistically analyzed using classification and regression trees. Focusing on pure acoustic phone decoding without language modeling, we present and discuss the results of the diagnostics, which are used to analyze the impact of intrinsic speech variabilities on speech recognition.
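As a rough illustration of the tree-based analysis the abstract describes, the sketch below finds the single best binary split of a classification tree by Gini impurity. It is not the authors' implementation: the feature names (`speaking_rate`, `pitch_hz`), the toy records, and the helper functions `gini` and `best_split` are all hypothetical, chosen only to show how per-phone data collected at a decoder could be related to recognition errors.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def best_split(rows, labels):
    """Return (feature, threshold, weighted_gini) of the best binary split."""
    best = None
    n = len(rows)
    for feature in rows[0]:
        values = sorted({row[feature] for row in rows})
        # Candidate thresholds are midpoints between consecutive distinct values.
        for low, high in zip(values, values[1:]):
            threshold = (low + high) / 2
            left = [y for row, y in zip(rows, labels) if row[feature] <= threshold]
            right = [y for row, y in zip(rows, labels) if row[feature] > threshold]
            score = (len(left) * gini(left) + len(right) * gini(right)) / n
            if best is None or score < best[2]:
                best = (feature, threshold, score)
    return best


# Hypothetical toy data: one record per decoded phone (assumed, not from the paper).
rows = [
    {"speaking_rate": 3.0, "pitch_hz": 120.0},
    {"speaking_rate": 3.5, "pitch_hz": 210.0},
    {"speaking_rate": 4.0, "pitch_hz": 130.0},
    {"speaking_rate": 6.0, "pitch_hz": 125.0},
    {"speaking_rate": 6.5, "pitch_hz": 200.0},
    {"speaking_rate": 7.0, "pitch_hz": 140.0},
]
labels = ["correct", "correct", "correct", "error", "error", "error"]

feature, threshold, impurity = best_split(rows, labels)
print(feature, threshold, impurity)
```

In this toy data the split on `speaking_rate` separates the two classes perfectly, which is the kind of finding a diagnostic tree is meant to surface; a full tree would apply the same search recursively to each branch.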
Pages: 459 / +
Page count: 2
Related papers
50 in total
  • [1] Hierarchical Phoneme Classification for Improved Speech Recognition
    Oh, Donghoon
    Park, Jeong-Sik
    Kim, Ji-Hwan
    Jang, Gil-Jin
    APPLIED SCIENCES-BASEL, 2021, 11 (01) : 1 - 17
  • [2] Speech recognition through phoneme segmentation and neural classification
    Maeran, O
    Piuri, V
    Gajani, GS
    IMTC/97 - IEEE INSTRUMENTATION & MEASUREMENT TECHNOLOGY CONFERENCE: SENSING, PROCESSING, NETWORKING, PROCEEDINGS VOLS 1 AND 2, 1997, : 1215 - 1220
  • [3] Phoneme recognition using speech image (spectrogram)
    Ahmadi, M
    Bailey, NJ
    Hoyle, BS
    ICSP '96 - 1996 3RD INTERNATIONAL CONFERENCE ON SIGNAL PROCESSING, PROCEEDINGS, VOLS I AND II, 1996, : 675 - 677
  • [4] Myoelectric signal classification for phoneme-based speech recognition
    Scheme, Erik J.
    Hudgins, Bernard
    Parker, Phillip A.
    IEEE TRANSACTIONS ON BIOMEDICAL ENGINEERING, 2007, 54 (04) : 694 - 699
  • [5] PHONEME GROUPING FOR SPEECH RECOGNITION
    REDDY, DR
    JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA, 1967, 41 (05) : 1295 - &
  • [6] Speech Emotion Recognition Using Spectrogram & Phoneme Embedding
    Yenigalla, Promod
    Kumar, Abhay
    Tripathi, Suraj
    Singh, Chirag
    Kar, Sibsambhu
    Vepa, Jithendra
    19TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2018), VOLS 1-6: SPEECH RESEARCH FOR EMERGING MARKETS IN MULTILINGUAL SOCIETIES, 2018, : 3688 - 3692
  • [7] Robust phoneme classification for automatic speech recognition using hybrid features and an amalgamated learning model
    Khwaja, Mohammed Kamal
    Vikash, Peddakota
    Arulmozhivarman, P.
    Lui, Simon
    INTERNATIONAL JOURNAL OF SPEECH TECHNOLOGY, 2016, 19 (04) : 895 - 905
  • [8] Speech coding and phoneme classification using MATLAB and NeuralWorks
    StGeorge, BA
    Wooten, EC
    Sellami, L
    FRONTIERS IN EDUCATION 1997 - 27TH ANNUAL CONFERENCE, PROCEEDINGS, VOLS I - III, 1997, : 12 - 12
  • [9] Speech Enhancement Using Source Information for Phoneme Recognition of Speech with Background Music
    Khonglah, Banriskhem K.
    Dey, Abhishek
    Prasanna, S. R. Mahadeva
    CIRCUITS SYSTEMS AND SIGNAL PROCESSING, 2019, 38 (02) : 643 - 663