DISCRIMINATIVE DEEP RECURRENT NEURAL NETWORKS FOR MONAURAL SPEECH SEPARATION

Authors
Wang, Guan-Xiang [1]
Hsu, Chung-Chien [1]
Chien, Jen-Tzung [1]
Affiliations
[1] Natl Chiao Tung Univ, Dept Elect & Comp Engn, Hsinchu 30010, Taiwan
Keywords
deep learning; discriminative learning; neural network; monaural speech separation; FACTORIZATION;
DOI
Not available
Chinese Library Classification (CLC) number
O42 [Acoustics];
Discipline classification code
070206 ; 082403 ;
Abstract
Deep neural networks are an emerging trend for solving a range of problems in speech processing. In this paper, we propose a discriminative deep recurrent neural network (DRNN) model for monaural speech separation. Our idea is to construct the DRNN as a regression model that discovers the deep structure and regularity needed to reconstruct signals from a mixture of two source spectra. To reinforce the discrimination between the two separated spectra, we estimate the DRNN separation parameters by minimizing an integrated objective function that consists of two measurements. One is the within-source reconstruction error for the individual source spectra; the other conveys discrimination information that preserves the mutual difference between the two source spectra during supervised training. This discrimination information acts as a form of regularization that maintains between-source separation in monaural source separation. In the experiments, we demonstrate the effectiveness of the proposed method for speech separation compared with other methods.
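The abstract describes the integrated objective only in words, so the following is a minimal sketch of one plausible reading, not the authors' actual formulation. It assumes squared-error reconstruction terms for each source plus a term that penalizes any change in the difference between the two spectra. The function names, the weight `lam`, and the exact form of the discrimination term are all hypothetical.

```python
from typing import List

Spectrum = List[List[float]]  # frames x frequency bins (magnitude spectra)


def squared_error(a: Spectrum, b: Spectrum) -> float:
    """Sum of squared element-wise differences between two spectra."""
    return sum(
        (x - y) ** 2
        for row_a, row_b in zip(a, b)
        for x, y in zip(row_a, row_b)
    )


def difference(a: Spectrum, b: Spectrum) -> Spectrum:
    """Element-wise difference a - b."""
    return [[x - y for x, y in zip(row_a, row_b)] for row_a, row_b in zip(a, b)]


def integrated_objective(
    est1: Spectrum,
    est2: Spectrum,
    ref1: Spectrum,
    ref2: Spectrum,
    lam: float = 0.5,  # hypothetical regularization weight
) -> float:
    # Within-source reconstruction error for each individual source.
    reconstruction = squared_error(est1, ref1) + squared_error(est2, ref2)
    # Assumed discrimination term: keep the estimated between-source
    # difference close to the true between-source difference.
    discrimination = squared_error(
        difference(est1, est2), difference(ref1, ref2)
    )
    return reconstruction + lam * discrimination


# Toy example: one frame, two frequency bins.
ref1 = [[1.0, 0.0]]
ref2 = [[0.0, 2.0]]
est1 = [[0.5, 0.5]]
est2 = [[0.5, 1.5]]

loss_perfect = integrated_objective(ref1, ref2, ref1, ref2)
loss_blurred = integrated_objective(est1, est2, ref1, ref2)
```

In this toy case the two estimates have drifted toward each other, so the assumed discrimination term adds a penalty on top of the reconstruction error. In a real system the estimates would come from the DRNN outputs and the loss would be minimized by gradient-based training.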
Pages: 2544-2548
Page count: 5