Multi-task Learning Deep Neural Networks For Speech Feature Denoising

Cited by: 0

Authors:

Huang, Bin [1]

Ke, Dengfeng [2]

Zheng, Hao [2]

Xu, Bo [2]

Xu, Yanyan [1]

Su, Kaile [3]

Affiliations:

[1] Beijing Forestry Univ, Sch Informat Sci & Technol, Beijing, Peoples R China

[2] Chinese Acad Sci, Inst Automat, Beijing, Peoples R China

[3] Griffith Univ, Inst Integrated & Intelligent Syst, Brisbane, Qld, Australia

Source:

16TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2015), VOLS 1-5 | 2015

Keywords:

multi-task learning; feature denoising; deep neural networks; ENHANCEMENT

DOI:

Not available

Chinese Library Classification (CLC) number:

O42 [Acoustics]

Subject classification codes:

070206; 082403

Abstract:

Traditional automatic speech recognition (ASR) systems usually suffer a sharp performance drop when noise is present in speech. To build a robust ASR system, we introduce a new model that uses multi-task learning deep neural networks (MTL-DNN) to solve the speech denoising task at the feature level. In this model, the networks are initialized by pre-training restricted Boltzmann machines (RBMs) and fine-tuned by jointly learning multiple interacting tasks with a shared representation. For multi-task learning, we choose a noisy-clean speech pair fitting task as the primary task and separately explore two constraints as secondary tasks: phone label and phone cluster. In the experiments, the MTL-DNN reconstructs denoised speech from noisy speech input, and the result is evaluated with both a DNN-hidden Markov model (HMM) based and a Gaussian mixture model (GMM)-HMM based ASR system. Results show that, with the denoised speech, the word error rate (WER) is reduced by 53.14% and 34.84%, respectively, compared with the baselines. The MTL-DNN model also outperforms the general single-task learning deep neural network (STL-DNN) model, with performance improvements of 4.93% and 3.88%, respectively.
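The joint objective described in the abstract (a primary noisy-to-clean fitting task plus a secondary phone-label task on a shared representation) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the single shared hidden layer, the toy dimensions, the random initialization (no RBM pre-training), the mean squared error and cross-entropy losses, and the hypothetical `aux_weight` that balances the two tasks. It computes the loss for one frame only and omits training.

```python
import math
import random

def linear(x, W, b):
    """Affine map; W is a list of rows, one row per output unit."""
    return [sum(w * v for w, v in zip(row, x)) + bias
            for row, bias in zip(W, b)]

def sigmoid(v):
    return [1.0 / (1.0 + math.exp(-a)) for a in v]

def log_softmax(v):
    m = max(v)  # subtract the max for numerical stability
    log_z = m + math.log(sum(math.exp(a - m) for a in v))
    return [a - log_z for a in v]

def init_layer(n_out, n_in, rng):
    """Small random weights and zero biases (stand-in for RBM pre-training)."""
    W = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    return W, [0.0] * n_out

def mtl_loss(params, noisy, clean, phone_id, aux_weight=0.3):
    """Joint loss for one frame: denoising MSE + aux_weight * phone cross-entropy."""
    hidden = sigmoid(linear(noisy, *params["shared"]))   # shared representation
    denoised = linear(hidden, *params["denoise_head"])   # primary task output
    phone_logp = log_softmax(linear(hidden, *params["phone_head"]))  # secondary task
    mse = sum((d - c) ** 2 for d, c in zip(denoised, clean)) / len(clean)
    cross_entropy = -phone_logp[phone_id]
    return mse + aux_weight * cross_entropy, denoised

rng = random.Random(0)
FEAT_DIM, HIDDEN_DIM, NUM_PHONES = 4, 8, 3  # toy sizes, not from the paper
params = {
    "shared": init_layer(HIDDEN_DIM, FEAT_DIM, rng),
    "denoise_head": init_layer(FEAT_DIM, HIDDEN_DIM, rng),
    "phone_head": init_layer(NUM_PHONES, HIDDEN_DIM, rng),
}
noisy_frame = [0.5, -0.2, 0.1, 0.9]   # made-up feature values
clean_frame = [0.4, -0.1, 0.0, 1.0]
loss, denoised = mtl_loss(params, noisy_frame, clean_frame, phone_id=1)
```

Setting `aux_weight=0` in this sketch reduces the objective to the denoising loss alone, which corresponds to the single-task (STL-DNN) setting that the abstract compares against.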


Pages: 2464-2468

Number of pages: 5

