A MAXIMUM LIKELIHOOD APPROACH TO MULTI-OBJECTIVE LEARNING USING GENERALIZED GAUSSIAN DISTRIBUTIONS FOR DNN-BASED SPEECH ENHANCEMENT

Cited by: 0
Authors
Niu, Shu-Tong [1]
Du, Jun [1]
Chai, Li [1]
Lee, Chin-Hui [2]
Affiliations
[1] Univ Sci & Technol China, Hefei, Anhui, Peoples R China
[2] Georgia Inst Technol, Atlanta, GA 30332 USA
Funding
National Natural Science Foundation of China; National Key Research and Development Program of China
Keywords
multi-objective learning; maximum likelihood; deep neural network; shape factors update; generalized Gaussian distribution; CONVOLUTIONAL NEURAL-NETWORK; FEATURES;
DOI
10.1109/icassp40776.2020.9053995
Chinese Library Classification number
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
Multi-objective learning with the minimum mean squared error criterion for DNN-based speech enhancement (MMSE-MOL-DNN) has been shown to outperform a single-output DNN. However, in MMSE-MOL-DNN the prediction errors on different targets span a very broad dynamic range, which makes DNN training difficult. In this paper, we extend the maximum likelihood approach proposed in our previous work [1] to multi-objective learning for DNN-based speech enhancement (ML-MOL-DNN), so that the dynamic range of the prediction errors on different targets is adjusted automatically. The conditional likelihood function to be maximized is derived under a generalized Gaussian distribution (GGD) error model, and the scale factors of the GGD control the dynamic range of the prediction errors on each target. Furthermore, instead of adjusting the shape factors manually, we propose a method that updates them automatically by exploiting the one-to-one mapping between the kurtosis and the shape factor of a GGD. Experimental results show that ML-MOL-DNN achieves better performance than MMSE-MOL-DNN across different objective measures.
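The abstract does not give formulas, so the following is only a minimal sketch of the quantities it names, assuming the standard zero-mean GGD density p(e) = β / (2αΓ(1/β)) · exp(−(|e|/α)^β) with scale factor α and shape factor β. All function names are hypothetical, and the bisection search is a stand-in for whatever inversion of the kurtosis mapping the paper actually uses.

```python
import math

def ggd_kurtosis(beta):
    """Kurtosis of a zero-mean GGD: Gamma(5/b) * Gamma(1/b) / Gamma(3/b)^2."""
    return math.exp(math.lgamma(5.0 / beta) + math.lgamma(1.0 / beta)
                    - 2.0 * math.lgamma(3.0 / beta))

def shape_from_kurtosis(kurt, lo=0.3, hi=10.0, iters=60):
    """Invert ggd_kurtosis by bisection (kurtosis decreases as beta grows).

    The search interval [lo, hi] is an assumption of this sketch.
    """
    # Clamp to the range that is invertible on [lo, hi].
    kurt = min(max(kurt, ggd_kurtosis(hi)), ggd_kurtosis(lo))
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if ggd_kurtosis(mid) > kurt:
            lo = mid  # model is too heavy-tailed, so beta must be larger
        else:
            hi = mid
    return 0.5 * (lo + hi)

def sample_kurtosis(errors):
    """Fourth moment over squared second moment, assuming zero-mean errors."""
    n = len(errors)
    m2 = sum(e ** 2 for e in errors) / n
    m4 = sum(e ** 4 for e in errors) / n
    return m4 / (m2 * m2)

def ml_scale(errors, beta):
    """Closed-form ML estimate of alpha for a fixed beta (zero-mean errors)."""
    n = len(errors)
    return (beta / n * sum(abs(e) ** beta for e in errors)) ** (1.0 / beta)

def ggd_nll(errors, alpha, beta):
    """Average negative log-likelihood of the errors under the GGD."""
    n = len(errors)
    log_norm = math.log(2.0 * alpha) + math.lgamma(1.0 / beta) - math.log(beta)
    return log_norm + sum((abs(e) / alpha) ** beta for e in errors) / n

def mol_objective(errors_by_target):
    """Sum of per-target GGD losses, each with its own alpha and beta."""
    total = 0.0
    for errors in errors_by_target:
        beta = shape_from_kurtosis(sample_kurtosis(errors))
        alpha = ml_scale(errors, beta)
        total += ggd_nll(errors, alpha, beta)
    return total
```

With β = 2 the density reduces to a Gaussian with α = √2·σ, and with β = 1 to a Laplacian. Because each target gets its own α, the term (|e|/α)^β is normalized per target, which illustrates how scale factors can keep targets with very different error magnitudes on a comparable footing.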
Pages: 6229-6233
Page count: 5