A MAXIMUM LIKELIHOOD APPROACH TO MULTI-OBJECTIVE LEARNING USING GENERALIZED GAUSSIAN DISTRIBUTIONS FOR DNN-BASED SPEECH ENHANCEMENT

Citations: 0
Authors
Niu, Shu-Tong [1]
Du, Jun [1]
Chai, Li [1]
Lee, Chin-Hui [2]
Affiliations
[1] Univ Sci & Technol China, Hefei, Anhui, Peoples R China
[2] Georgia Inst Technol, Atlanta, GA 30332 USA
Funding
National Natural Science Foundation of China; National Key Research and Development Program of China
Keywords
multi-objective learning; maximum likelihood; deep neural network; shape factors update; generalized Gaussian distribution; CONVOLUTIONAL NEURAL-NETWORK; FEATURES;
DOI
10.1109/icassp40776.2020.9053995
Chinese Library Classification
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
Multi-objective learning using the minimum mean squared error criterion for DNN-based speech enhancement (MMSE-MOL-DNN) has been shown to outperform a single-output DNN. However, in MMSE-MOL-DNN the prediction errors on different targets span a very broad dynamic range, which makes DNN training difficult. In this paper, we extend the maximum likelihood approach proposed in our previous work [1] to multi-objective learning for DNN-based speech enhancement (ML-MOL-DNN), so that the dynamic range of the prediction errors on different targets is adjusted automatically. The conditional likelihood function to be maximized is derived under a generalized Gaussian distribution (GGD) error model, and the scale factors of the GGD control the dynamic range of the prediction errors on the different targets. Furthermore, we propose a method that updates the shape factors automatically, using the one-to-one mapping between kurtosis and shape factor in the GGD, instead of adjusting them manually. Experimental results show that ML-MOL-DNN outperforms MMSE-MOL-DNN on several objective measures.
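The kurtosis-to-shape mapping mentioned in the abstract can be illustrated with a minimal sketch. It relies on the standard zero-mean GGD density f(e) = β / (2αΓ(1/β)) · exp(-(|e|/α)^β), whose kurtosis is Γ(5/β)Γ(1/β)/Γ(3/β)². The function names, the bisection search, its bounds, and the closed-form scale estimate are assumptions made for illustration; the record does not describe the paper's actual update procedure, so this should not be read as the authors' implementation.

```python
import math


def ggd_kurtosis(beta):
    """Kurtosis of a zero-mean GGD with shape factor beta."""
    # Gamma(5/b) * Gamma(1/b) / Gamma(3/b)^2, computed in the log domain.
    return math.exp(
        math.lgamma(5.0 / beta)
        + math.lgamma(1.0 / beta)
        - 2.0 * math.lgamma(3.0 / beta)
    )


def sample_kurtosis(errors):
    """Non-excess sample kurtosis (fourth central moment / variance^2)."""
    n = len(errors)
    mean = sum(errors) / n
    m2 = sum((e - mean) ** 2 for e in errors) / n
    m4 = sum((e - mean) ** 4 for e in errors) / n
    return m4 / (m2 * m2)


def shape_from_kurtosis(kurt, lo=0.3, hi=10.0, iters=80):
    """Invert ggd_kurtosis by bisection (assumed search range [lo, hi])."""
    # Kurtosis decreases monotonically in beta, so clamp and bisect.
    kurt = min(max(kurt, ggd_kurtosis(hi)), ggd_kurtosis(lo))
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if ggd_kurtosis(mid) > kurt:
            lo = mid  # kurtosis still too high: shape must be larger
        else:
            hi = mid
    return 0.5 * (lo + hi)


def ml_scale(errors, beta):
    """ML estimate of the scale factor alpha for a fixed shape beta."""
    n = len(errors)
    return (beta / n * sum(abs(e) ** beta for e in errors)) ** (1.0 / beta)


def ggd_nll(errors, alpha, beta):
    """Negative log-likelihood of the errors under a zero-mean GGD."""
    n = len(errors)
    log_norm = math.log(2.0 * alpha) + math.lgamma(1.0 / beta) - math.log(beta)
    return n * log_norm + sum((abs(e) / alpha) ** beta for e in errors)
```

Under these assumptions, one update for a single target would compute `sample_kurtosis` on that target's prediction errors, map it to a shape factor with `shape_from_kurtosis`, re-estimate the scale with `ml_scale`, and use `ggd_nll` as that target's loss term. A shape factor of 2 corresponds to the Gaussian case (kurtosis 3) and a shape factor of 1 to the Laplacian case (kurtosis 6).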
Pages: 6229-6233
Page count: 5
Related papers
50 in total
  • [1] A DNN-based Multi-Objective Precoding for Gaussian MIMO Networks
    Zhang, Xinliang
    Vaezi, Mojtaba
    2020 IEEE GLOBAL COMMUNICATIONS CONFERENCE (GLOBECOM), 2020,
  • [2] A Maximum Likelihood Approach to SNR-Progressive Learning Using Generalized Gaussian Distribution for LSTM-Based Speech Enhancement
    Zhang, Xiao-Qi
    Du, Jun
    Chai, Li
    Lee, Chin-Hui
    INTERSPEECH 2021, 2021, : 2701 - 2705
  • [3] Multi-Objective DNN-Based Precoder for MIMO Communications
    Zhang, Xinliang
    Vaezi, Mojtaba
    IEEE TRANSACTIONS ON COMMUNICATIONS, 2021, 69 (07) : 4476 - 4488
  • [4] DNN-BASED SPEECH ENHANCEMENT USING MBE MODEL
    Huang, Qizheng
    Bao, Changchun
    Wang, Xianyun
    Xiang, Yang
    2018 16TH INTERNATIONAL WORKSHOP ON ACOUSTIC SIGNAL ENHANCEMENT (IWAENC), 2018, : 196 - 200
  • [5] AN OBJECTIVE EVALUATION OF HEARING AIDS AND DNN-BASED BINAURAL SPEECH ENHANCEMENT IN COMPLEX ACOUSTIC SCENES
    Guso, Enric
    Luberadzka, Joanna
    Baig, Marti
    Sayin, Umut
    Serra, Xavier
    2023 IEEE WORKSHOP ON APPLICATIONS OF SIGNAL PROCESSING TO AUDIO AND ACOUSTICS, WASPAA, 2023,
  • [6] Using Generalized Gaussian Distributions to Improve Regression Error Modeling for Deep Learning-Based Speech Enhancement
    Chai, Li
    Du, Jun
    Liu, Qing-Feng
    Lee, Chin-Hui
    IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2019, 27 (12) : 1919 - 1931
  • [7] Towards More Efficient DNN-Based Speech Enhancement Using Quantized Correlation Mask
    Abdullah, Salinna
    Zamani, Majid
    Demosthenous, Andreas
    IEEE ACCESS, 2021, 9 : 24350 - 24362
  • [8] ON USING HETEROGENEOUS DATA FOR VEHICLE-BASED SPEECH RECOGNITION: A DNN-BASED APPROACH
    Feng, Xue
    Richardson, Brigitte
    Amman, Scott
    Glass, James
    2015 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING (ICASSP), 2015, : 4385 - 4389
  • [9] Perceptual Characteristics Based Multi-objective Model for Speech Enhancement
    Peng, Chiang-Jen
    Shen, Yih-Liang
    Chan, Yun-Ju
    Yu, Cheng
    Tsao, Yu
    Chi, Tai-Shih
    INTERSPEECH 2022, 2022, : 211 - 215
  • [10] ON GENERATING MIXING NOISE SIGNALS WITH BASIS FUNCTIONS FOR SIMULATING NOISY SPEECH AND LEARNING DNN-BASED SPEECH ENHANCEMENT MODELS
    Wen, Shi-Xue
    Du, Jun
    Lee, Chin-Hui
    2017 IEEE 27TH INTERNATIONAL WORKSHOP ON MACHINE LEARNING FOR SIGNAL PROCESSING, 2017,