Research on Speech Enhancement Algorithm of Multiresolution Cochleagram Based on Skip Connection Deep Neural Network

Cited by: 4
Authors
Lan, Chaofeng [1]
Wang, YuQiao [1]
Zhang, Lei [2]
Liu, Chundong [1]
Lin, Xiaojia [1]
Affiliations
[1] Harbin Univ Sci & Technol, Coll Measurement & Commun Engn, Harbin 150080, Peoples R China
[2] Beidahuang Ind Grp Gen Hosp, Harbin 150088, Peoples R China
Keywords
MASK;
DOI
10.1155/2022/5208372
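Chinese Library Classification (CLC) number
TM [Electrical Engineering]; TN [Electronic Technology, Communication Technology];
Discipline classification code
0808; 0809;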
Abstract
Traditional deep learning algorithms enhance speech poorly at low signal-to-noise ratios (SNR). The skip-connection deep neural network (Skip-DNN) improves on the traditional deep neural network (DNN) by adding skip connections between the layers of the network, which addresses the degradation problem of DNNs. In this paper, Multiresolution Cochleagram (MRCG) features in the gammachirp transform domain are denoised to obtain the improved MRCG (I-MRCG). Denoising uses the Minimum Mean-Square Error Short-Time Spectral Amplitude Estimator (MMSE-STSA); the resulting I-MRCG serves as the input feature and Skip-DNN as the training network, which improves the speech enhancement effect of the model. This paper also proposes an improved source-to-distortion ratio (SDR) loss function, which improves the performance of the Skip-DNN speech enhancement model. The experiments are performed on the Edinburgh dataset. With I-MRCG as the input feature of Skip-DNN, the average perceptual evaluation of speech quality (PESQ) is 2.9137 and the average short-time objective intelligibility (STOI) is 0.8515, improvements of 0.91% and 0.71%, respectively, over MRCG as the input feature. With the improved SDR as the loss function of the speech model, the average PESQ is 2.9699 and the average STOI is 0.8547. Compared with other loss functions, the improved SDR yields better enhancement when used as the loss function of the speech enhancement model.
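The two ideas named in the abstract, skip connections between layers and an SDR-based loss, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. It assumes identity (additive) skip connections between equal-width ReLU layers, and it uses the standard SDR definition, 10·log10(‖s‖² / ‖s − ŝ‖²), rather than the paper's improved SDR, whose form the abstract does not give. The function names `skip_dnn_forward`, `sdr_db`, and `negative_sdr_loss` are hypothetical.

```python
import numpy as np


def skip_dnn_forward(x, weights, biases):
    """Forward pass through equal-width ReLU layers with additive skips.

    Assumption: each layer's input is added to its activated output
    (identity skip connection); the paper's exact wiring may differ.
    """
    h = x
    for W, b in zip(weights, biases):
        h = np.maximum(0.0, h @ W + b) + h
    return h


def sdr_db(reference, estimate, eps=1e-8):
    """Standard source-to-distortion ratio in dB.

    This is the textbook definition, not the paper's improved SDR.
    eps guards against division by zero and log of zero.
    """
    signal_power = np.sum(reference ** 2)
    distortion_power = np.sum((reference - estimate) ** 2)
    return 10.0 * np.log10((signal_power + eps) / (distortion_power + eps))


def negative_sdr_loss(reference, estimate):
    """Loss to minimize: a higher SDR gives a lower loss."""
    return -sdr_db(reference, estimate)
```

With all weights and biases set to zero, each layer reduces to the identity, which shows how the skip path lets the input pass through unchanged; this is the usual motivation for using skip connections against degradation in deeper networks.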
Pages: 15
Related papers
50 in total
  • [1] Speech Enhancement Algorithm Combining Cochlear Features and Deep Neural Network with Skip Connections
    Lan, Chaofeng
    Wang, Yuqiao
    Zhang, Lei
    Yu, Zelong
    Liu, Chundong
    Guo, Xiaoxia
    JOURNAL OF SIGNAL PROCESSING SYSTEMS FOR SIGNAL IMAGE AND VIDEO TECHNOLOGY, 2023, 95 (08): : 979 - 989
  • [2] Speech Enhancement Algorithm Combining Cochlear Features and Deep Neural Network with Skip Connections
    Chaofeng Lan
    Yuqiao Wang
    Lei Zhang
    Zelong Yu
    Chundong Liu
    Xiaoxia Guo
    Journal of Signal Processing Systems, 2023, 95 : 979 - 989
  • [3] SPEECH ENHANCEMENT BASED ON DEEP NEURAL NETWORKS WITH SKIP CONNECTIONS
    Tu, Ming
    Zhang, Xianxian
    2017 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2017, : 5565 - 5569
  • [4] RESEARCH ON ENGLISH SPEECH ENHANCEMENT ALGORITHM BASED ON IMPROVED SPECTRAL SUBTRACTION AND DEEP NEURAL NETWORK
    Zhou, Qiaoling
    INTERNATIONAL JOURNAL OF INNOVATIVE COMPUTING INFORMATION AND CONTROL, 2020, 16 (05): : 1711 - 1723
  • [5] Speech Enhancement based on Deep Convolutional Neural Network
    Nuthakki, Ramesh
    Masanta, Payel
    Yukta, T. N.
    PROCEEDINGS OF THE 2021 FIFTH INTERNATIONAL CONFERENCE ON I-SMAC (IOT IN SOCIAL, MOBILE, ANALYTICS AND CLOUD) (I-SMAC 2021), 2021, : 770 - 775
  • [6] Supervised speech enhancement based on deep neural network
    Saleem, Nasir
    Khattak, Muhammad Irfan
    Qazi, Abdul Baser
    JOURNAL OF INTELLIGENT & FUZZY SYSTEMS, 2019, 37 (04) : 5187 - 5201
  • [7] Speech Enhancement using Convolutional Neural Network with Skip Connections
    Shi, Yupeng
    Rong, Weicong
    Zheng, Nengheng
    2018 11TH INTERNATIONAL SYMPOSIUM ON CHINESE SPOKEN LANGUAGE PROCESSING (ISCSLP), 2018, : 6 - 10
  • [8] An optimization method for speech enhancement based on deep neural network
    Sun, Haixia
    Li, Sikun
    3RD INTERNATIONAL CONFERENCE ON ADVANCES IN ENERGY, ENVIRONMENT AND CHEMICAL ENGINEERING, 2017, 69
  • [9] Speech enhancement based on noise classification and deep neural network
    Wang, Wenbo
    Liu, Houguang
    Yang, Jianhua
    Cao, Guohua
    Hua, Chunli
    MODERN PHYSICS LETTERS B, 2019, 33 (17):
  • [10] ADAPTIVE MULTIRESOLUTION SPEECH ENHANCEMENT ALGORITHM BASED ON WAVELET TRANSFORM
    Zheng Yuanjin, Li Lemin, Wen Maosheng (National Key Lab. of Optical Fiber Communication
    Journal of Electronics (China), 1999, (02) : 97 - 103