Binaural Deep Neural Network for Noise Robust Automatic Speech Recognition

Cited by: 0
Authors
Jiang, Yi [1 ]
Zu, Yuan-Yuan [1 ]
Affiliations
[1] Quartermaster Equipment Res Inst, Beijing, Peoples R China
Keywords
Deep Neural Network (DNN); Computational Auditory Scene Analysis (CASA); Automatic Speech Recognition (ASR); Ideal Parameter Mask;
DOI
Not available
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812 ;
Abstract
Robust automatic speech recognition (ASR) is a challenging task, especially in noisy environments. The mismatch between the speech model trained on clean speech and the model of noisy speech is a main factor that degrades the performance of ASR systems. The goal of a robust ASR system is to recover the energy distribution of the target speech, which provides the discriminative information for the acoustic model. We use a binaural deep neural network (DNN) to estimate the energy of the target speech in the mixture through signal-to-noise ratio (SNR) estimation. The estimated target speech is then used as the input to a conventional ASR system to improve recognition accuracy. We use the ideal parameter mask as the DNN training target and cross entropy as the training cost function. Experiments show that the proposed algorithm gives robust ASR performance under various SNR conditions.
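The abstract's pipeline (estimate a per-unit SNR, turn it into a mask, train with cross entropy, apply the mask to the mixture) can be illustrated with a minimal sketch. This record does not define the "ideal parameter mask", so the sketch substitutes the commonly used ratio mask SNR / (1 + SNR) as a stand-in; the function names and the example values are hypothetical and are not taken from the paper.

```python
import math


def ratio_mask_from_snr_db(snr_db):
    """Map a local SNR in dB to a soft mask value in (0, 1).

    Assumption: a ratio mask SNR / (1 + SNR) stands in for the paper's
    "ideal parameter mask", which this record does not define.
    """
    snr_linear = 10.0 ** (snr_db / 10.0)
    return snr_linear / (1.0 + snr_linear)


def cross_entropy(targets, predictions, eps=1e-12):
    """Mean binary cross entropy between soft targets and predictions."""
    total = 0.0
    for t, p in zip(targets, predictions):
        p = min(max(p, eps), 1.0 - eps)  # keep log() finite
        total -= t * math.log(p) + (1.0 - t) * math.log(1.0 - p)
    return total / len(targets)


def apply_mask(mixture_energy, mask):
    """Scale each time-frequency unit's mixture energy by its mask value."""
    return [e * m for e, m in zip(mixture_energy, mask)]


# Hypothetical local SNRs (dB) for four time-frequency units.
snrs_db = [-10.0, 0.0, 10.0, 20.0]
target_mask = [ratio_mask_from_snr_db(s) for s in snrs_db]

# A hypothetical network output to score against the target mask.
predicted_mask = [0.2, 0.4, 0.8, 0.9]
loss = cross_entropy(target_mask, predicted_mask)

# Estimated target-speech energy that would be passed on to the recognizer.
estimated_energy = apply_mask([1.0, 1.0, 1.0, 1.0], predicted_mask)
```

In this sketch the cross entropy is smallest when the predicted mask equals the target mask, which is why a mask bounded in (0, 1) pairs naturally with that cost function.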
Pages: 512 - 517
Number of pages: 6
Related papers
50 in total
  • [21] Deep Q-network-based noise suppression for robust speech recognition
    Park, Tae-Jun
    Chang, Joon-Hyuk
    TURKISH JOURNAL OF ELECTRICAL ENGINEERING AND COMPUTER SCIENCES, 2021, 29 (05) : 2362 - 2373
  • [22] Deep Q-network-based noise suppression for robust speech recognition
    Park T.-J.
    Chang J.-H.
    Turkish Journal of Electrical Engineering and Computer Sciences, 2021, 25 (09) : 2362 - 2373
  • [23] Noise Adaptive Training for Robust Automatic Speech Recognition
    Kalinli, Ozlem
    Seltzer, Michael L.
    Droppo, Jasha
    Acero, Alex
    IEEE TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2010, 18 (08): : 1889 - 1901
  • [24] A Noise-Robust Speech Recognition System Based on Wavelet Neural Network
    Wang, Yiping
    Zhao, Zhefeng
    ARTIFICIAL INTELLIGENCE AND COMPUTATIONAL INTELLIGENCE, PT III, 2011, 7004 : 392 - 397
  • [25] CEPSTRAL NOISE SUBTRACTION FOR ROBUST AUTOMATIC SPEECH RECOGNITION
    Rehr, Robert
    Gerkmann, Timo
    2015 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING (ICASSP), 2015, : 375 - 378
  • [26] Robust automatic speech recognition in impulsive noise environment
    Ding, P
    Cao, ZG
    CHINESE JOURNAL OF ELECTRONICS, 2005, 14 (01): : 165 - 168
  • [27] Noise-robust automatic speech recognition using a predictive echo state network
    Skowronski, Mark D.
    Harris, John G.
    IEEE TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2007, 15 (05): : 1724 - 1730
  • [28] Noise-robust automatic speech recognition using a discriminative echo state network
    Skowronski, Mark D.
    Harris, John G.
    2007 IEEE INTERNATIONAL SYMPOSIUM ON CIRCUITS AND SYSTEMS, VOLS 1-11, 2007, : 1771 - 1774
  • [29] EXEMPLAR-BASED SPEECH ENHANCEMENT FOR DEEP NEURAL NETWORK BASED AUTOMATIC SPEECH RECOGNITION
    Baby, Deepak
    Gemmeke, Jort F.
    Virtanen, Tuomas
    Van hamme, Hugo
    2015 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING (ICASSP), 2015, : 4485 - 4489
  • [30] Automatic Speech Recognition with Deep Neural Networks for Impaired Speech
    Espana-Bonet, Cristina
    Fonollosa, Jose A. R.
    ADVANCES IN SPEECH AND LANGUAGE TECHNOLOGIES FOR IBERIAN LANGUAGES, IBERSPEECH 2016, 2016, 10077 : 97 - 107