Robust Speech Recognition based on Multi-Objective Learning with GRU Network

Authors
Liu, Ming [1, 2, 3]
Wang, Yujun [2 ]
Yan, Zhaoyu [1 ]
Wang, Jing [1 ]
Xie, Xiang [1 ]
Affiliations
[1] Beijing Inst Technol, Beijing, Peoples R China
[2] Xiao Inc, Beijing, Peoples R China
[3] Xiaomi Corp, Speech Grp, Beijing, Peoples R China
Keywords
DEEP NEURAL-NETWORKS;
DOI
Not available
Chinese Library Classification (CLC) number
TP31 [Computer Software];
Discipline classification code
081202; 0835;
Abstract
This paper proposes a new speech enhancement (SE) scheme for recognition based on multi-objective learning, which uses three objectives in the training of a gated recurrent unit (GRU) network. The first objective is the main target of the SE task: directly mapping noisy log-power spectrum (LPS) features to clean Mel-frequency cepstral coefficient (MFCC) features. The second is an auxiliary target that helps improve the main one by learning additional information from the back-end acoustic model (AM). The third is also an auxiliary target, achieved by learning information from the mapping of noisy LPS to clean LPS. The two auxiliary structures help the original structure optimize the network parameters by correcting errors. This approach imposes more constraints on the direct feature mapping and passes information from the acoustic model to the network, enabling the enhancement network to better serve the AM. The experimental results show that the new multi-objective scheme, with joint feature mapping and posterior probability learning, improves SE performance. The scheme also significantly lowers the Character Error Rate (CER) of the AM compared to the baseline deep neural network (DNN).
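As a rough illustration of how three objectives can be combined during training, the sketch below sums a main MFCC-mapping loss with two weighted auxiliary losses. The abstract does not give the loss functions or weights, so everything here is an assumption: mean squared error for the two feature-mapping targets, cross-entropy against acoustic-model posteriors for the second objective, and the hypothetical weights `alpha` and `beta`. The function names are illustrative and not taken from the paper.

```python
import math


def mse(pred, target):
    """Mean squared error between two equal-length feature vectors."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)


def cross_entropy(pred_probs, target_probs, eps=1e-12):
    """Cross-entropy of predicted posteriors against target posteriors."""
    return -sum(t * math.log(p + eps) for p, t in zip(pred_probs, target_probs))


def multi_objective_loss(mfcc_pred, mfcc_clean,
                         post_pred, post_target,
                         lps_pred, lps_clean,
                         alpha=0.1, beta=0.1):
    """Main loss plus two weighted auxiliary losses.

    alpha and beta are hypothetical weights (assumptions, not paper values).
    """
    main = mse(mfcc_pred, mfcc_clean)                # noisy LPS -> clean MFCC
    aux_am = cross_entropy(post_pred, post_target)   # information from the AM
    aux_lps = mse(lps_pred, lps_clean)               # noisy LPS -> clean LPS
    return main + alpha * aux_am + beta * aux_lps
```

In a setup like this, the auxiliary terms act only as extra training constraints; at inference time only the MFCC output would be passed to the acoustic model.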
Pages: 181-185
Number of pages: 5