Missing feature theory applied to robust speech recognition over IP network

Authors
Endo, T [1]
Kuroiwa, S
Nakamura, S
Affiliations
[1] ATR, Spoken Language Translat Res Labs, Kyoto 6190288, Japan
[2] Univ Tokushima, Tokushima 7708506, Japan
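Source
IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS | 2004 / Vol. E87D / No. 05
Keywords
DSR; data loss; data imputation; marginalization
DOI
Not available
Chinese Library Classification
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract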
This paper addresses problems involved in performing speech recognition over mobile and IP networks. The main problem is speech data loss caused by packet loss in the network. We present two missing-feature-based approaches that recover lost regions of speech data. These approaches are based on the reconstruction of missing frames or on marginal distributions. For comparison, we also use a packing method, which skips lost data. We evaluate these approaches with two packet loss models, i.e., the random loss and Gilbert loss models. The results show that the marginal-distribution-based technique is the most effective in a packet loss environment: with the DSR front-end, the degradation of word accuracy is only 5% when the packet loss rate is 30%, and only 3% when the mean burst loss length is 24 frames. The simple data imputation method is also effective in the case of clean speech.
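
The abstract names a Gilbert loss model, a packing baseline, and reconstruction of missing frames, but gives no implementation details. The following is a minimal Python sketch of those three ideas under stated assumptions, not the authors' method: the function names (`gilbert_loss_mask`, `pack`, `impute_linear`) and parameters are hypothetical, each frame is reduced to a single scalar feature, the Gilbert model is taken in its common two-state form, and linear interpolation stands in for whatever reconstruction the paper actually uses.

```python
import random


def gilbert_loss_mask(n_frames, p_good_to_bad, p_bad_to_good, seed=0):
    """Simulate a two-state Gilbert channel (assumed form).

    Returns a list of booleans, True where the frame is lost.
    In this parameterization the mean burst length is 1 / p_bad_to_good.
    """
    rng = random.Random(seed)
    lost = False
    mask = []
    for _ in range(n_frames):
        if lost:
            # Remain in the loss state unless the channel recovers.
            lost = rng.random() >= p_bad_to_good
        else:
            lost = rng.random() < p_good_to_bad
        mask.append(lost)
    return mask


def pack(frames, mask):
    """Packing baseline: skip lost frames and keep the rest in order."""
    return [f for f, is_lost in zip(frames, mask) if not is_lost]


def impute_linear(frames, mask):
    """Fill lost frames by linear interpolation between received neighbours.

    Assumption: this is an illustrative stand-in for data imputation.
    Lost frames at either edge copy the nearest received frame.
    """
    received = [i for i, is_lost in enumerate(mask) if not is_lost]
    if not received:
        raise ValueError("no received frames to interpolate from")
    out = list(frames)
    for i, is_lost in enumerate(mask):
        if not is_lost:
            continue
        prev = max((j for j in received if j < i), default=None)
        nxt = min((j for j in received if j > i), default=None)
        if prev is None:
            out[i] = frames[nxt]
        elif nxt is None:
            out[i] = frames[prev]
        else:
            w = (i - prev) / (nxt - prev)
            out[i] = frames[prev] + w * (frames[nxt] - frames[prev])
    return out
```

Under this parameterization, a mean burst loss length of 24 frames would correspond to `p_bad_to_good = 1 / 24`. The marginalization approach is not sketched here because it changes the recognizer's likelihood computation rather than the feature stream.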
Pages: 1119-1126
Page count: 8