Speech Enhancement Using Joint DNN-NMF Model Learned with Multi-Objective Frequency Differential Spectrum Loss Function

Cited by: 2
Authors
Pashaian, Matin [1]
Seyedin, Sanaz [1]
Affiliations
[1] Amirkabir Univ Technol, Dept Elect Engn, Speech Proc Res Lab, Tehran, Iran
Funding
National Science Foundation (USA);
Keywords
SEPARATION; QUALITY; SPARSE; NOISE;
DOI
10.1049/2024/8881007
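Chinese Library Classification (CLC) number
TM [Electrical Engineering]; TN [Electronics and Communication Technology];
Discipline classification code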
0808 ; 0809 ;
Abstract
We propose a multi-objective joint model of non-negative matrix factorization (NMF) and a deep neural network (DNN) with a new loss function for speech enhancement. The proposed loss function (LMOFD) is a weighted combination of a frequency differential spectrum mean squared error (MSE)-based loss function (LFD) and a multi-objective MSE loss function (LMO). The conventional MSE loss function computes the discrepancy between the estimated speech and the clean speech across all frequencies, but it disregards how the amplitude changes along the frequency axis, which contains valuable information. The differential spectrum representation retains the spectral peaks that carry important information, so using it helps to ensure that this information in the speech signal is preserved. In addition, noise spectra typically have a flat shape, and because the differential operation drives the flat parts of a spectrum close to zero, the differential spectrum is resistant to noises with smooth structures. We therefore propose a frequency-differentiated loss function that considers the differences in magnitude spectrum between neighboring frequency bins in each time frame. This approach maintains the spectral variations of the target signal in the frequency domain, which can effectively reduce the degrading effects of noise. The multi-objective MSE term LMO combines two losses: one on the NMF coefficients, which are the intermediate output targets, and one on the original spectral signals, which are the actual output targets. Using encoded NMF coefficients as low-dimensional structural features for the DNN serves as prior knowledge and helps the learning process. LMO is used alongside LFD so that the training loss takes advantage of the properties of both the original and the differential spectrum. Moreover, a DNN-based noise classification and fusion (NCF) strategy is proposed to exploit a discriminative model for noise reduction. The experiments show that the proposed approach improves on previous methods.
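The loss described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the abstract does not give the exact formulas, so the first-order difference along the frequency axis, the convex weighting by `alpha`, the weight `beta` on the NMF-coefficient term, and all function names are assumptions made for illustration only.

```python
import numpy as np


def mse(a, b):
    """Mean squared error between two arrays of the same shape."""
    return float(np.mean((a - b) ** 2))


def freq_diff(spec):
    """Difference between neighboring frequency bins in each time frame.

    spec has shape (frames, bins); the result has shape (frames, bins - 1).
    Assumption: a plain first-order difference stands in for the paper's
    differential spectrum.
    """
    return np.diff(spec, axis=1)


def loss_fd(est_spec, clean_spec):
    """Frequency differential spectrum MSE (stand-in for LFD)."""
    return mse(freq_diff(est_spec), freq_diff(clean_spec))


def loss_mo(est_spec, clean_spec, est_coef, clean_coef, beta=0.5):
    """Multi-objective MSE (stand-in for LMO): spectrum term plus
    NMF-coefficient term. beta is a hypothetical weight."""
    return mse(est_spec, clean_spec) + beta * mse(est_coef, clean_coef)


def loss_mofd(est_spec, clean_spec, est_coef, clean_coef, alpha=0.5, beta=0.5):
    """Weighted combination of the two losses (stand-in for LMOFD).
    alpha is a hypothetical mixing weight in [0, 1]."""
    fd = loss_fd(est_spec, clean_spec)
    mo = loss_mo(est_spec, clean_spec, est_coef, clean_coef, beta)
    return alpha * fd + (1.0 - alpha) * mo


# Toy data: 4 frames, 8 frequency bins, 3 NMF coefficients per frame.
rng = np.random.default_rng(0)
clean = np.abs(rng.normal(size=(4, 8)))
coef = np.abs(rng.normal(size=(4, 3)))

# A perfectly flat additive "noise" floor across frequency.
flat_noisy = clean + 0.3

print("plain MSE:", mse(flat_noisy, clean))
print("differential MSE:", loss_fd(flat_noisy, clean))
print("combined:", loss_mofd(flat_noisy, clean, coef, coef))
```

In this toy case the flat offset is penalized by the plain MSE but nearly vanishes under the differential loss, which matches the abstract's remark that the differential spectrum is resistant to noise with a smooth structure. Combining both terms keeps the model sensitive to the absolute spectral level as well.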
Pages: 10
Related papers
29 in total
  • [1] MODEL: Multi-Objective Differential Evolution with Leadership Enhancement
    Bourennani, Farid
    Rahnamayan, Shahryar
    Naterer, Greg F.
    2014 IEEE CONGRESS ON EVOLUTIONARY COMPUTATION (CEC), 2014, : 1131 - 1138
  • [2] Multi-objective Noisy based Deep Feature Loss for Speech Enhancement
    Pilarczyk, Rafał
    Skarbek, Wladyslaw
    PHOTONICS APPLICATIONS IN ASTRONOMY, COMMUNICATIONS, INDUSTRY, AND HIGH-ENERGY PHYSICS EXPERIMENTS 2019, 2019, 11176
  • [3] Perceptual Characteristics Based Multi-objective Model for Speech Enhancement
    Peng, Chiang-Jen
    Shen, Yih-Liang
    Chan, Yun-Ju
    Yu, Cheng
    Tsao, Yu
    Chi, Tai-Shih
    INTERSPEECH 2022, 2022, : 211 - 215
  • [4] A MAXIMUM LIKELIHOOD APPROACH TO MULTI-OBJECTIVE LEARNING USING GENERALIZED GAUSSIAN DISTRIBUTIONS FOR DNN-BASED SPEECH ENHANCEMENT
    Niu, Shu-Tong
    Du, Jun
    Chai, Li
    Lee, Chin-Hui
    2020 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, 2020, : 6229 - 6233
  • [5] Development of statistical estimators for speech enhancement using multi-objective grey wolf optimizer
    Dash, Tusar Kanti
    Solanki, Sandeep Singh
    Panda, Ganapati
    Satapathy, Suresh Chandra
    EVOLUTIONARY INTELLIGENCE, 2021, 14 (02) : 767 - 778
  • [6] Development of statistical estimators for speech enhancement using multi-objective grey wolf optimizer
    Tusar Kanti Dash
    Sandeep Singh Solanki
    Ganapati Panda
    Suresh Chandra Satapathy
    Evolutionary Intelligence, 2021, 14 : 767 - 778
  • [7] NARMAX Model Identification Using Multi-Objective Optimization Differential Evolution
    Zakaria, Mohd Zakimi
    Mansor, Zakwan
    Noe, Azuwir Mohd
    Saad, Mohd Sazli
    Baharudin, Mohamad Ezral
    Ahmad, Robiah
    INTERNATIONAL JOURNAL OF INTEGRATED ENGINEERING, 2018, 10 (07) : 188 - 203
  • [8] Tree Model Reconstruction Innovization Using Multi-objective Differential Evolution
    Zamuda, Ales
    Brest, Janez
    2012 IEEE CONGRESS ON EVOLUTIONARY COMPUTATION (CEC), 2012,
  • [9] Spectrum Allocation in Cognitive Radio Networks using Multi-Objective Differential Evolution Algorithm
    Anumandla, Kiran Kumar
    Akella, Bharadwaj
    Sabat, Samrat L.
    Udgata, Siba K.
    2ND INTERNATIONAL CONFERENCE ON SIGNAL PROCESSING AND INTEGRATED NETWORKS (SPIN) 2015, 2015, : 264 - 269
  • [10] A multi-objective learning speech enhancement algorithm based on IRM post-processing with joint estimation of SCNN and TCNN
    Li, Ruwei
    Sun, Xiaoyue
    Li, Tao
    Zhao, Fengnian
    DIGITAL SIGNAL PROCESSING, 2020, 101