Speech Enhancement Using Joint DNN-NMF Model Learned with Multi-Objective Frequency Differential Spectrum Loss Function

Cited by: 2
Authors
Pashaian, Matin [1]
Seyedin, Sanaz [1]
Affiliations
[1] Amirkabir Univ Technol, Dept Elect Engn, Speech Proc Res Lab, Tehran, Iran
Funding
U.S. National Science Foundation
Keywords
SEPARATION; QUALITY; SPARSE; NOISE;
DOI
10.1049/2024/8881007
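Chinese Library Classification
TM [Electrical Engineering]; TN [Electronic Technology, Communication Technology]
Discipline classification codes
0808; 0809
Abstract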
We propose a multi-objective joint model of non-negative matrix factorization (NMF) and a deep neural network (DNN) with a new loss function for speech enhancement. The proposed loss function (LMOFD) is a weighted combination of a frequency differential spectrum mean squared error (MSE)-based loss function (LFD) and a multi-objective MSE loss function (LMO). The conventional MSE loss function computes the discrepancy between the estimated speech and the clean speech across all frequencies, disregarding how the amplitude changes along the frequency axis, which contains valuable information. The differential spectrum representation retains the spectral peaks that carry important information, so using it helps ensure that this information in the speech signal is preserved. In addition, noise spectra typically have a flat shape, and because the differential operation brings the flat parts of the spectrum close to zero, the differential spectrum is resistant to noise with a smooth structure. We therefore propose a frequency-differentiated loss function that considers the differences in the magnitude spectrum between neighboring frequency bins in each time frame. This approach maintains the spectral variations of the target signal in the frequency domain, which can effectively reduce the degrading effects of noise. The multi-objective MSE term LMO combines two losses: one on the NMF coefficients, which are the intermediate output targets, and one on the original spectral signals, which are the actual output targets. The encoded NMF coefficients, used as low-dimensional structural features for the DNN, serve as prior knowledge and help the learning process. LMO is used alongside LFD so that the training loss function takes advantage of the properties of both the original and the differential spectrum. Moreover, a DNN-based noise classification and fusion (NCF) strategy is proposed to exploit a discriminative model for noise reduction. The experiments show that the proposed approach improves on previous methods.
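The loss structure described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names, the array shapes (time frames by frequency bins), and the weights alpha and beta are assumptions made for illustration, and the abstract does not state how the terms are actually weighted. The sketch only shows why a loss on differences between neighboring frequency bins ignores a spectrally flat offset that the plain MSE penalizes.

```python
import numpy as np

def mse(a, b):
    """Mean squared error between two arrays of the same shape."""
    return float(np.mean((a - b) ** 2))

def freq_diff(spec):
    """Difference between neighboring frequency bins in each time frame.

    spec has shape (frames, bins); the result has shape (frames, bins - 1).
    """
    return np.diff(spec, axis=1)

def fd_loss(est_spec, clean_spec):
    """Sketch of LFD: MSE between frequency-differentiated magnitude spectra."""
    return mse(freq_diff(est_spec), freq_diff(clean_spec))

def mo_loss(est_spec, clean_spec, est_nmf, clean_nmf, beta=0.5):
    """Sketch of LMO: MSE on NMF coefficients (intermediate targets) plus
    MSE on the spectra (actual targets). beta is an assumed weight."""
    return beta * mse(est_nmf, clean_nmf) + (1.0 - beta) * mse(est_spec, clean_spec)

def mofd_loss(est_spec, clean_spec, est_nmf, clean_nmf, alpha=0.5, beta=0.5):
    """Sketch of LMOFD: weighted combination of LFD and LMO.
    alpha is an assumed weight, not a value taken from the paper."""
    return (alpha * fd_loss(est_spec, clean_spec)
            + (1.0 - alpha) * mo_loss(est_spec, clean_spec, est_nmf, clean_nmf, beta))

# Toy data: a "clean" magnitude spectrum and an estimate with a flat offset,
# standing in for residual noise with a smooth spectral shape.
rng = np.random.default_rng(0)
clean = rng.random((4, 8))
estimate = clean + 0.3

print("plain MSE:", mse(estimate, clean))
print("frequency differential MSE:", fd_loss(estimate, clean))
```

In this toy case the plain MSE is nonzero while the frequency differential term is essentially zero, which is the reason the abstract gives for keeping LMO alongside LFD: the differential term alone cannot see errors that are constant across frequency.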
Pages: 10