Speech Enhancement using Convolution Neural Network-based Spectrogram Denoising

Cited by: 0
Authors
Hu Xuhong [1]
Yan Lin-Huang [2]
Lu Xun [3]
Guan Yuan-Sheng [2]
Hu Wenlin [1]
Wang Jie [2, 4]
Affiliations
[1] China Railway Design Corp, Natl Engn Lab Digital Construct & Evaluat Urban R, Tianjin, Peoples R China
[2] Guangzhou Univ, Sch Elect & Commun Engn, Guangzhou, Guangdong, Peoples R China
[3] Guangdong Power Grid Co, Power Grid Planning Ctr, Guangzhou, Guangdong, Peoples R China
[4] Ctr Rd Traff Noise Control, Natl Environm Protect Engn & Technol, Beijing, Peoples R China
Keywords
Speech enhancement; deep learning; convolution neural network; spectrogram denoising; NOISE; EFFICIENT;
DOI
10.1109/CMMNO53328.2021.9467599
Chinese Library Classification (CLC) number
TH [Machinery and Instrument Industry];
Discipline classification code
0802;
Abstract
Treating the spectrogram as an image, this paper adopts a convolutional neural network (CNN)-based image enhancement algorithm for spectrogram denoising, so that speech is denoised when its spectrogram is enhanced. A spectrogram clipping strategy is presented to obtain a large amount of training data; it reduces storage cost and avoids the limited network depth and excessive complexity commonly encountered when training a recurrent neural network on traditional speech features. A deeper network is constructed to improve capacity and flexibility so that the features of the spectrogram are used better, and it captures enough spatial information to reduce noise effectively. In addition, the proposed model uses a residual learning strategy in CNN training, combined with batch normalization, which greatly improves the performance of the model. The experimental results demonstrate that the proposed spectrogram denoising model has better learning ability and denoising performance in both known-noise and noise-mismatch conditions, so the proposed system delivers robust speech enhancement.
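The data-preparation ideas in the abstract (clipping a spectrogram into patches and training on a residual) can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names, the log-magnitude scaling, the frame length, hop, patch size, stride, and the synthetic test signal are all assumptions chosen for illustration, since the abstract does not specify them. The CNN itself is omitted; only the patch extraction and the residual target are shown.

```python
import numpy as np

def magnitude_spectrogram(signal, frame_len=256, hop=128):
    """Log-magnitude STFT spectrogram of shape (freq_bins, frames).
    frame_len and hop are illustrative values, not taken from the paper."""
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    frames = np.stack(
        [signal[i * hop : i * hop + frame_len] * window for i in range(n_frames)],
        axis=1,
    )
    return np.log1p(np.abs(np.fft.rfft(frames, axis=0)))

def clip_patches(spec, patch=40, stride=20):
    """Cut a spectrogram into overlapping square patches.
    patch and stride are hypothetical sizes chosen for this sketch."""
    n_freq, n_time = spec.shape
    patches = [
        spec[f : f + patch, t : t + patch]
        for f in range(0, n_freq - patch + 1, stride)
        for t in range(0, n_time - patch + 1, stride)
    ]
    return np.stack(patches)

# Synthetic stand-in for a clean/noisy speech pair (assumption: 1 s at 16 kHz).
rng = np.random.default_rng(0)
t = np.arange(16000) / 16000.0
clean = np.sin(2 * np.pi * 440.0 * t)
noisy = clean + 0.3 * rng.standard_normal(clean.shape)

clean_patches = clip_patches(magnitude_spectrogram(clean))
noisy_patches = clip_patches(magnitude_spectrogram(noisy))

# Residual learning: the network would be trained to predict the difference
# between the noisy and clean patches rather than the clean patch itself.
residual_targets = noisy_patches - clean_patches

# At inference the denoised patch is the noisy input minus the predicted
# residual; with a perfect prediction this recovers the clean patch.
recovered = noisy_patches - residual_targets
```

Under these assumed settings, one second of audio yields many small training patches from a single spectrogram, which is the sense in which clipping enlarges the training set.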
Pages: 310-318
Page count: 9