PAN: PHONEME-AWARE NETWORK FOR MONAURAL SPEECH ENHANCEMENT

Cited by: 0
Authors
Du, Zhihao [1]
Lei, Ming [2]
Han, Jiqing [1]
Zhang, Shiliang [2]
Affiliations
[1] Harbin Inst Technol, Sch Comp Sci & Technol, Harbin, Peoples R China
[2] Alibaba Grp, Machine Intelligence Technol, Hangzhou, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Monaural speech enhancement; phonetic posteriorgram; phoneme-aware network;
DOI
10.1109/icassp40776.2020.9054334
Chinese Library Classification (CLC) number
O42 [Acoustics];
Discipline classification code
070206 ; 082403 ;
Abstract
Current methods for monaural speech enhancement rely only on acoustic information and seldom consider the phonetic information of an utterance. In the voice conversion community, significant progress has been achieved by exploiting phonetic information through phonetic posteriorgrams (PPGs). Inspired by this progress, we propose a phoneme-aware network (PAN) that uses noisy PPGs for speech enhancement. Since PPG prediction and speech enhancement benefit from each other, a PPG predictor is incorporated into the PAN, and an iterative training algorithm is proposed for it. Experimental results show that using phonetic information improves enhancement performance in terms of speech intelligibility, perceptual quality, and character error rate. To the best of our knowledge, this is the first work to introduce PPGs into speech enhancement.
Pages: 6634-6638
Page count: 5