PAN: PHONEME-AWARE NETWORK FOR MONAURAL SPEECH ENHANCEMENT

Cited by: 0

Authors:

Du, Zhihao [1]

Lei, Ming [2]

Han, Jiqing [1]

Zhang, Shiliang [2]

Affiliations:

[1] Harbin Inst Technol, Sch Comp Sci & Technol, Harbin, Peoples R China

[2] Alibaba Grp, Machine Intelligence Technol, Hangzhou, Peoples R China

Source:

2020 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING | 2020

Funding:

National Natural Science Foundation of China;

Keywords:

Monaural speech enhancement; phonetic posteriorgram; phoneme-aware network;

DOI:

10.1109/icassp40776.2020.9054334

Chinese Library Classification (CLC) number:

O42 [Acoustics];

Subject classification codes:

070206 ; 082403 ;

Abstract:

Current methods for monaural speech enhancement rely only on acoustic information and seldom consider the phonetic information of an utterance. In the voice conversion community, significant progress has been achieved by exploiting phonetic information via phonetic posteriorgrams (PPGs). Inspired by this progress, we propose a phoneme-aware network (PAN) that utilizes noisy PPGs for speech enhancement. Since PPG prediction and speech enhancement benefit from each other, a PPG predictor is incorporated into the PAN and an iterative training algorithm is proposed for it. Experimental results show that using the phonetic information improves enhancement performance in terms of speech intelligibility, perceptual quality and character error rate. To the best of our knowledge, this is the first work to introduce PPGs into speech enhancement.


Pages: 6634-6638

Number of pages: 5

Related records: 50 in total

[31] GAN-in-GAN for Monaural Speech Enhancement
Duan, Yicun
Ren, Jianfeng
Yu, Heng
Jiang, Xudong
IEEE SIGNAL PROCESSING LETTERS, 2023, 30 : 853 - 857
[32] Monaural speech enhancement based on periodicity analysis
Chen, Z.
Hohmann, V
BIOMEDICAL ENGINEERING-BIOMEDIZINISCHE TECHNIK, 2014, 59 : S736 - S736
[33] PFRNet: Dual-Branch Progressive Fusion Rectification Network for Monaural Speech Enhancement
Yu, Runxiang
Zhao, Ziwei
Ye, Zhongfu
IEEE SIGNAL PROCESSING LETTERS, 2022, 29 : 2358 - 2362
[34] Deep neural network techniques for monaural speech enhancement and separation: state of the art analysis
Peter Ochieng
Artificial Intelligence Review, 2023, 56 : 3651 - 3703
[35] Real and imaginary part interaction network for monaural speech enhancement and de-reverberation
Zhang, Zehua
He, Changjun
Xu, Shiyun
Wang, Mingjiang
2023 ASIA PACIFIC SIGNAL AND INFORMATION PROCESSING ASSOCIATION ANNUAL SUMMIT AND CONFERENCE, APSIPA ASC, 2023, : 972 - 977
[36] Supervised Attention Multi-Scale Temporal Convolutional Network for monaural speech enhancement
Zhang, Zehua
Zhang, Lu
Zhuang, Xuyi
Qian, Yukun
Wang, Mingjiang
EURASIP JOURNAL ON AUDIO SPEECH AND MUSIC PROCESSING, 2024, 2024 (01)
[37] Deep neural network techniques for monaural speech enhancement and separation: state of the art analysis
Ochieng, Peter
ARTIFICIAL INTELLIGENCE REVIEW, 2023, 56 (SUPPL3) : S3651 - S3703
[38] Deep Neural Network Based Monaural Speech Enhancement with Low-Rank Analysis and Speech Present Probability
Shi, Wenhua
Zhang, Xiongwei
Zou, Xia
Sun, Meng
Han, Wei
Li, Li
Min, Gang
IEICE TRANSACTIONS ON FUNDAMENTALS OF ELECTRONICS COMMUNICATIONS AND COMPUTER SCIENCES, 2018, E101A (03) : 585 - 589
[39] Phoneme Aware Speech Recognition through Evolutionary Optimisation
Bird, Jordan J.
Wanner, Elizabeth
Ekart, Aniko
Faria, Diego R.
PROCEEDINGS OF THE 2019 GENETIC AND EVOLUTIONARY COMPUTATION CONFERENCE COMPANION (GECCO'19 COMPANION), 2019, : 362 - 363
[40] Phoneme-Aware Adaptation with Discrepancy Minimization and Dynamically-Classified Vector for Text-independent Speaker Verification
Wang, Jia
Lan, Tianhao
Chen, Jie
Luo, Chengwen
Wu, Chao
Li, Jianqiang
PROCEEDINGS OF THE 30TH ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, MM 2022, 2022, : 6737 - 6745
