Multi-task Learning Deep Neural Networks For Speech Feature Denoising

被引:0
|
作者
Huang, Bin [1 ]
Ke, Dengfeng [2 ]
Zheng, Hao [2 ]
Xu, Bo [2 ]
Xu, Yanyan [1 ]
Su, Kaile [3 ]
机构
[1] Beijing Forestry Univ, Sch Informat Sci & Technol, Beijing, Peoples R China
[2] Chinese Acad Sci, Inst Automat, Beijing, Peoples R China
[3] Griffith Univ, Inst Integrated & Intelligent Syst, Brisbane, Qld, Australia
关键词
multi-task learning; feature denoising; deep neural networks; ENHANCEMENT;
D O I
暂无
中图分类号
O42 [声学];
学科分类号
070206 ; 082403 ;
摘要
Traditional automatic speech recognition (ASR) systems usually get a sharp performance drop when noise presents in speech. To make a robust ASR, we introduce a new model using the multi-task learning deep neural networks (MTL-DNN) to solve the speech denoising task in feature level. In this model, the networks are initialized by pre-training restricted Boltzmann machines (RBM) and fine-tuned by jointly learning multiple interactive tasks using a shared representation. In multi-task learning, we choose a noisy-clean speech pair fitting task as the primary task and separately explore two constraints as the secondary tasks: phone label and phone cluster. In experiments, the denoised speech is reconstructed by the MTL-DNN using the noisy speech as input and it is respectively evaluated by the DNN-hidden Markov model (HMM) based and the Gaussian Mixture Model (GMM)-HMM based ASR systems. Results show that, using the denoised speech, the word error rate (WER) is respectively reduced by 53.14% and 34.84% compared with baselines. The MTL-DNN model also outperforms the general single-task learning deep neural networks (STL-DNN) model with a performance improvement of 4.93% and 3.88% respectively.
引用
收藏
页码:2464 / 2468
页数:5
相关论文
共 50 条
  • [1] Adversarial Multi-task Learning of Deep Neural Networks for Robust Speech Recognition
    Shinohara, Yusuke
    17TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2016), VOLS 1-5: UNDERSTANDING SPEECH PROCESSING IN HUMANS AND MACHINES, 2016, : 2369 - 2372
  • [2] MULTI-TASK JOINT-LEARNING OF DEEP NEURAL NETWORKS FOR ROBUST SPEECH RECOGNITION
    Qian, Yanmin
    Yin, Maofan
    You, Yongbin
    Yu, Kai
    2015 IEEE WORKSHOP ON AUTOMATIC SPEECH RECOGNITION AND UNDERSTANDING (ASRU), 2015, : 310 - 316
  • [3] Adaptive Feature Aggregation in Deep Multi-Task Convolutional Neural Networks
    Cui, Chaoran
    Shen, Zhen
    Huang, Jin
    Chen, Meng
    Xu, Mingliang
    Wang, Meng
    Yin, Yilong
    IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2022, 32 (04) : 2133 - 2144
  • [4] Deep Adaptive Feature Aggregation in Multi-task Convolutional Neural Networks
    Shen, Zhen
    Cui, Chaoran
    Huang, Jin
    Zong, Jian
    Chen, Meng
    Yin, Yilong
    CIKM '20: PROCEEDINGS OF THE 29TH ACM INTERNATIONAL CONFERENCE ON INFORMATION & KNOWLEDGE MANAGEMENT, 2020, : 2213 - 2216
  • [5] DEEP NEURAL NETWORKS EMPLOYING MULTI-TASK LEARNING AND STACKED BOTTLENECK FEATURES FOR SPEECH SYNTHESIS
    Wu, Zhizheng
    Valentini-Botinhao, Cassia
    Watts, Oliver
    King, Simon
    2015 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING (ICASSP), 2015, : 4460 - 4464
  • [6] Evolving Deep Parallel Neural Networks for Multi-Task Learning
    Wu, Jie
    Sun, Yanan
    ALGORITHMS AND ARCHITECTURES FOR PARALLEL PROCESSING, ICA3PP 2021, PT II, 2022, 13156 : 517 - 531
  • [7] Deep Asymmetric Multi-task Feature Learning
    Lee, Hae Beom
    Yang, Eunho
    Hwang, Sung Ju
    INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 80, 2018, 80
  • [8] Multi-Adaptive Optimization for multi-task learning with deep neural networks
    Hervella, alvaro S.
    Rouco, Jose
    Novo, Jorge
    Ortega, Marcos
    NEURAL NETWORKS, 2024, 170 : 254 - 265
  • [9] Deep Convolutional Neural Networks for Multi-Instance Multi-Task Learning
    Zeng, Tao
    Ji, Shuiwang
    2015 IEEE INTERNATIONAL CONFERENCE ON DATA MINING (ICDM), 2015, : 579 - 588
  • [10] Cell tracking using deep neural networks with multi-task learning
    He, Tao
    Mao, Hua
    Guo, Jixiang
    Yi, Zhang
    IMAGE AND VISION COMPUTING, 2017, 60 : 142 - 153