Harmonic enhancement using learnable comb filter for light-weight full-band speech enhancement model

Cited by: 2

Authors:

Le, Xiaohuai [1,2]

Lei, Tong [1,3]

Chen, Li [2]

Guo, Yiqing [2]

He, Chao [2]

Chen, Cheng [2]

Xia, Xianjun [2]

Gao, Hua [2]

Xiao, Yijian [2]

Ding, Piao [2]

Song, Shenyi [2]

Lu, Jing [1,3]

Affiliations:

[1] Nanjing Univ, Key Lab Modern Acoust, Nanjing 210093, Peoples R China

[2] ByteDance, RTC Lab, Beijing, Peoples R China

[3] Horizon Robot, NJU Horizon Intelligent Audio Lab, Beijing 100094, Peoples R China

Source:

INTERSPEECH 2023 | 2023

Funding:

National Natural Science Foundation of China;

Keywords:

Comb filter; Speech enhancement; PercepNet; DeepFilterNet; NETWORKS;

DOI:

10.21437/Interspeech.2023-186

Chinese Library Classification (CLC) number:

O42 [Acoustics];

Discipline classification code:

070206; 082403;

Abstract:

With fewer feature dimensions, filter banks are often used in light-weight full-band speech enhancement models. In order to further enhance the coarse speech in the sub-band domain, it is necessary to apply post-filtering for harmonic retrieval. The signal processing-based comb filters used in RNNoise and PercepNet have limited performance and may cause speech quality degradation due to inaccurate fundamental frequency estimation. To tackle this problem, we propose a learnable comb filter to enhance harmonics. Based on the sub-band model, we design a DNN-based fundamental frequency estimator to estimate the discrete fundamental frequencies and a comb filter for harmonic enhancement, which are trained in an end-to-end manner. The experiments show the advantages of our proposed method over PercepNet and DeepFilterNet.

Pages: 3894-3898

Page count: 5
