Harmonic enhancement using learnable comb filter for light-weight full-band speech enhancement model

Cited by: 2
Authors
Le, Xiaohuai [1, 2]
Lei, Tong [1, 3]
Chen, Li [2 ]
Guo, Yiqing [2 ]
He, Chao [2 ]
Chen, Cheng [2 ]
Xia, Xianjun [2 ]
Gao, Hua [2 ]
Xiao, Yijian [2 ]
Ding, Piao [2 ]
Song, Shenyi [2 ]
Lu, Jing [1, 3]
Affiliations
[1] Nanjing Univ, Key Lab Modern Acoust, Nanjing 210093, Peoples R China
[2] ByteDance, RTC Lab, Beijing, Peoples R China
[3] Horizon Robot, NJU Horizon Intelligent Audio Lab, Beijing 100094, Peoples R China
Source
Funding
National Natural Science Foundation of China;
Keywords
Comb filter; Speech enhancement; PercepNet; DeepFilterNet; NETWORKS;
DOI
10.21437/Interspeech.2023-186
Chinese Library Classification (CLC) number
O42 [Acoustics];
Discipline classification code
070206 ; 082403 ;
Abstract
Because they reduce the feature dimension, filter banks are often used in light-weight full-band speech enhancement models. To further enhance the coarse speech estimated in the sub-band domain, post-filtering is needed for harmonic retrieval. The signal-processing-based comb filters used in RNNoise and PercepNet have limited performance and may degrade speech quality when the fundamental frequency is estimated inaccurately. To tackle this problem, we propose a learnable comb filter to enhance harmonics. Based on the sub-band model, we design a DNN-based fundamental frequency estimator that estimates discrete fundamental frequencies, together with a comb filter for harmonic enhancement; both are trained end to end. Experiments show the advantages of the proposed method over PercepNet and DeepFilterNet.
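For orientation, the conventional signal-processing comb filter that the abstract contrasts with can be sketched roughly as below. This is not the learnable filter proposed in the paper: the function name `comb_filter`, the `strength` parameter, the symmetric three-tap form, and the assumption that the fundamental frequency is already known are all illustrative choices made here.

```python
import numpy as np

def comb_filter(x, f0, fs, strength=0.5):
    """Symmetric three-tap comb filter (illustrative, not the paper's method).

    Computes y[n] = (x[n] + a * (x[n - T] + x[n + T])) / (1 + 2a),
    where T is the pitch period in samples and a = strength.
    Components at multiples of fs / T pass with unit gain, while
    components halfway between harmonics are attenuated.
    """
    period = int(round(fs / f0))          # pitch period T in samples
    padded = np.pad(x, period)            # zero-pad both ends by T
    delayed = padded[:-2 * period]        # x[n - T]
    advanced = padded[2 * period:]        # x[n + T]
    return (x + strength * (delayed + advanced)) / (1.0 + 2.0 * strength)

# Hypothetical usage: a 200 Hz tone at 48 kHz is kept, a 300 Hz tone
# (midway between the first and second harmonic) is suppressed.
fs, f0 = 48000, 200.0
n = np.arange(4800)
harmonic = np.sin(2 * np.pi * f0 * n / fs)
inter_harmonic = np.sin(2 * np.pi * 1.5 * f0 * n / fs)
kept = comb_filter(harmonic, f0, fs)
suppressed = comb_filter(inter_harmonic, f0, fs)
```

The sketch makes the weakness named in the abstract visible: the pass bands sit at multiples of `fs / period`, so an error in `f0` shifts them away from the true harmonics and the filter then attenuates speech instead of noise.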
Pages: 3894-3898
Page count: 5