Robust Speech Recognition based on Multi-Objective Learning with GRU Network

Authors
Liu, Ming [1, 2, 3]
Wang, Yujun [2 ]
Yan, Zhaoyu [1 ]
Wang, Jing [1 ]
Xie, Xiang [1 ]
Affiliations
[1] Beijing Inst Technol, Beijing, Peoples R China
[2] Xiao Inc, Beijing, Peoples R China
[3] Xiaomi Corp, Speech Grp, Beijing, Peoples R China
Keywords
DEEP NEURAL-NETWORKS;
DOI
Not available
Chinese Library Classification number
TP31 [Computer Software];
Discipline classification code
081202 ; 0835 ;
Abstract
This paper proposes a new speech enhancement (SE) scheme for speech recognition, based on a multi-objective learning method that uses three objectives when training a gated recurrent unit (GRU) network. The first objective is the main target of the SE task: directly mapping noisy log-power spectrum (LPS) features to clean Mel-frequency cepstral coefficient (MFCC) features. The second is an auxiliary target that helps improve the main one by learning additional information from the back-end acoustic model (AM). The third is also an auxiliary target, which learns from mapping noisy LPS to clean LPS. The two auxiliary structures help the original structure optimize the network parameters by correcting errors. The approach imposes more constraints on the direct feature mapping and passes information from the acoustic model to the network, enabling the enhanced network to better serve the AM. Experimental results show that the new multi-objective scheme, with joint feature mapping and posterior probability learning, improves SE performance and significantly lowers the character error rate (CER) of the AM compared with the baseline deep neural network (DNN).
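The three objectives described in the abstract can be pictured as one combined training loss. The following is only a minimal sketch under stated assumptions, not the authors' implementation: the abstract does not give the loss functions or their weights, so the mean squared error terms, the cross-entropy term for the acoustic-model posteriors, the weights `alpha` and `beta`, and all function names are hypothetical.

```python
import math


def mse(pred, target):
    """Mean squared error between two equal-length feature vectors."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)


def cross_entropy(posteriors, label):
    """Negative log-probability assigned to the reference acoustic-model state."""
    # Clamp to avoid log(0); the floor value is an arbitrary choice.
    return -math.log(max(posteriors[label], 1e-12))


def multi_objective_loss(mfcc_pred, mfcc_clean, am_post, am_label,
                         lps_pred, lps_clean, alpha=0.1, beta=0.1):
    """Weighted sum of one main and two auxiliary objectives (assumed form)."""
    # Main objective: noisy LPS -> clean MFCC feature mapping.
    main = mse(mfcc_pred, mfcc_clean)
    # Auxiliary objective 1: information from the back-end acoustic model,
    # assumed here to be a cross-entropy on its state posteriors.
    aux_am = cross_entropy(am_post, am_label)
    # Auxiliary objective 2: noisy LPS -> clean LPS mapping.
    aux_lps = mse(lps_pred, lps_clean)
    return main + alpha * aux_am + beta * aux_lps
```

In an actual GRU training setup the three predictions would come from output heads sharing the same recurrent layers, so that gradients from the auxiliary terms also shape the shared parameters; how the heads are arranged in the paper is not stated in this record.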
Pages: 181-185
Number of pages: 5