Missing feature theory applied to robust speech recognition over IP network

Cited by: 0
Authors
Endo, T [1]
Kuroiwa, S
Nakamura, S
Affiliations
[1] ATR, Spoken Language Translat Res Labs, Kyoto 6190288, Japan
[2] Univ Tokushima, Tokushima 7708506, Japan
Source
IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS | 2004 / Vol. E87D / No. 05
Keywords
DSR; data loss; data imputation; marginalization
DOI
Not available
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
This paper addresses problems involved in performing speech recognition over mobile and IP networks. The main problem is speech data loss caused by packet loss in the network. We present two missing-feature-based approaches that recover lost regions of speech data. These approaches are based on the reconstruction of missing frames or on marginal distributions. For comparison, we also use a packing method, which skips lost data. We evaluate these approaches with packet loss models, i.e., random loss and Gilbert loss models. The results show that the marginal-distribution-based technique is the most effective in a packet loss environment: the degradation of word accuracy is only 5% when the packet loss rate is 30% and only 3% when the mean burst loss length is 24 frames in the case of the DSR front-end. The simple data imputation method is also effective in the case of clean speech.
Pages: 1119-1126
Page count: 8
Related papers
50 in total
  • [11] Feature extraction for robust speech recognition
    Dharanipragada, S
    2002 IEEE INTERNATIONAL SYMPOSIUM ON CIRCUITS AND SYSTEMS, VOL II, PROCEEDINGS, 2002, : 855 - 858
  • [12] Missing data techniques for robust speech recognition
    Cooke, M
    Morris, A
    Green, P
    1997 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOLS I - V: VOL I: PLENARY, EXPERT SUMMARIES, SPECIAL, AUDIO, UNDERWATER ACOUSTICS, VLSI; VOL II: SPEECH PROCESSING; VOL III: SPEECH PROCESSING, DIGITAL SIGNAL PROCESSING; VOL IV: MULTIDIMENSIONAL SIGNAL PROCESSING, NEURAL NETWORKS - VOL V: STATISTICAL SIGNAL AND ARRAY PROCESSING, APPLICATIONS, 1997, : 863 - 866
  • [13] Reconstruction of missing features for robust speech recognition
    Raj, B
    Seltzer, ML
    Stern, RM
    SPEECH COMMUNICATION, 2004, 43 (04) : 275 - 296
  • [14] MMSE-Based Missing-Feature Reconstruction With Temporal Modeling for Robust Speech Recognition
    Gonzalez, Jose A.
    Peinado, Antonio M.
    Ma, Ning
    Gomez, Angel M.
    Barker, Jon
    IEEE TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2013, 21 (03) : 624 - 635
  • [15] Mask classification for missing-feature reconstruction for robust speech recognition in unknown background noise
    Kim, Wooil
    Stern, Richard M.
    SPEECH COMMUNICATION, 2011, 53 (01) : 1 - 11
  • [16] Coarse speech recognition by audio-visual integration based on missing feature theory
    Koiwa, Tomoaki
    Nakadai, Kazuhiro
    Imura, Jun-ichi
    2007 IEEE/RSJ INTERNATIONAL CONFERENCE ON INTELLIGENT ROBOTS AND SYSTEMS, VOLS 1-9, 2007, : 1757 - 1762
  • [17] Deep Neural Network Based Spectral Feature Mapping for Robust Speech Recognition
    Han, Kun
    He, Yanzhang
    Bagchi, Deblin
    Fosler-Lussier, Eric
    Wang, DeLiang
    16TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2015), VOLS 1-5, 2015, : 2484 - 2488
  • [18] Investigation of speech recognition over IP channels
    Van Sciver, J
    Ma, JZ
    Vanpoucke, F
    Van Hamme, H
    2002 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOLS I-IV, PROCEEDINGS, 2002, : 3812 - 3815
  • [19] EFFECT OF FEATURE SMOOTHING FOR ROBUST SPEECH RECOGNITION
    Xiao, Xiong
    Chng, Eng Siong
    Li, Haizhou
    2008 6TH INTERNATIONAL SYMPOSIUM ON CHINESE SPOKEN LANGUAGE PROCESSING, PROCEEDINGS, 2008, : 73 - 76
  • [20] ROBUST FEATURE EXTRACTORS FOR CONTINUOUS SPEECH RECOGNITION
    Alam, M. J.
    Kenny, P.
    Dumouchel, P.
    O'Shaughnessy, D.
    2014 PROCEEDINGS OF THE 22ND EUROPEAN SIGNAL PROCESSING CONFERENCE (EUSIPCO), 2014, : 944 - 948