DNN TRAINING BASED ON CLASSIC GAIN FUNCTION FOR SINGLE-CHANNEL SPEECH ENHANCEMENT AND RECOGNITION

Cited by: 0
Authors
Tu, Yan-Hui [1]
Du, Jun [1]
Lee, Chin-Hui [2]
Affiliations
[1] Univ Sci & Technol China, Hefei, Anhui, Peoples R China
[2] Georgia Inst Technol, Atlanta, GA 30332 USA
Funding
National Key Research and Development Program of China; National Natural Science Foundation of China
Keywords
statistical speech enhancement; ideal ratio mask; deep learning; gain function; speech recognition; noise
DOI
Not available
Chinese Library Classification (CLC) number
O42 [Acoustics]
Subject classification codes
070206; 082403
Abstract
In conventional single-channel speech enhancement based on the noise power spectrum, the speech gain function, which suppresses background noise at each time-frequency bin, is calculated from the prior signal-to-noise ratio (SNR). Accurate prior SNR estimation is therefore paramount for successful noise suppression. Accordingly, we recently proposed a single-channel approach that combines conventional and deep learning techniques for speech enhancement and automatic speech recognition (ASR). However, the combination takes place at the testing stage, which makes the procedure complicated and time-consuming. In this study, the gain function of classic speech enhancement is used to optimize the ideal ratio mask based deep neural network (DNN-IRM) at the training stage; the resulting model is denoted GF-DNN-IRM. At the testing stage, the IRM estimated by the GF-DNN-IRM model is used directly to generate enhanced speech, without involving the conventional speech enhancement process. In addition, DNNs with fewer parameters in the causal processing mode are also discussed. Experiments on the CHiME-4 challenge task show that, in causal mode and without acoustic model retraining, the proposed algorithm achieves a relative word error rate reduction of 6.57% on the RealData test set compared with unprocessed speech, whereas the traditional DNN-IRM method fails to improve ASR performance in this case.
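The abstract does not state which classic gain function or which IRM definition the paper uses, so the following is only an illustrative sketch under assumptions. It assumes the Wiener gain G = xi / (1 + xi) as one common gain function driven by the prior SNR xi, and the power-ratio form of the IRM, S / (S + N). All function names are hypothetical. Under these assumed definitions, the Wiener gain evaluated at the oracle prior SNR xi = S / N equals the IRM, which gives some intuition for why a gain function can act as guidance for a mask-estimating network.

```python
def wiener_gain(prior_snr: float) -> float:
    """Wiener gain for one time-frequency bin (assumed gain function).

    prior_snr is the linear-scale (not dB) prior SNR, xi >= 0.
    """
    if prior_snr < 0:
        raise ValueError("prior SNR must be non-negative")
    return prior_snr / (1.0 + prior_snr)


def ideal_ratio_mask(speech_power: float, noise_power: float) -> float:
    """Power-ratio IRM for one bin (assumed definition): S / (S + N)."""
    total = speech_power + noise_power
    # A bin with no energy at all is fully suppressed.
    return speech_power / total if total > 0 else 0.0


def apply_mask(noisy_magnitudes, mask):
    """Scale each noisy magnitude bin by the matching mask value.

    Both arguments are lists of frames; each frame is a list of bins.
    """
    return [
        [mag * gain for mag, gain in zip(frame_mags, frame_mask)]
        for frame_mags, frame_mask in zip(noisy_magnitudes, mask)
    ]
```

For example, a bin with speech power 4.0 and noise power 1.0 has an oracle prior SNR of 4.0, and both `wiener_gain(4.0)` and `ideal_ratio_mask(4.0, 1.0)` evaluate the same ratio, 4 / 5.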
Pages: 910-914
Page count: 5
Related papers
  • [1] SINGLE-CHANNEL SPEECH ENHANCEMENT WITH SEQUENTIALLY TRAINED DNN SYSTEM
    Sun, Yang
    Xian, Yang
    Wang, Wenwu
    Naqvi, Syed Mohsen
    2019 13TH INTERNATIONAL CONFERENCE ON SIGNAL PROCESSING AND COMMUNICATION SYSTEMS (ICSPCS), 2019
  • [2] Single-Channel Speech Enhancement Techniques for Distant Speech Recognition
    Ashwini, Jaya
    Kumaraswamy, Ramaswamy
    JOURNAL OF INTELLIGENT SYSTEMS, 2013, 22 (02) : 81 - 93
  • [3] INVESTIGATION OF A PARAMETRIC GAIN APPROACH TO SINGLE-CHANNEL SPEECH ENHANCEMENT
    Huang, Gongping
    Chen, Jingdong
    Benesty, Jacob
    2015 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING (ICASSP), 2015 : 206 - 210
  • [4] Robust Speaker Recognition Based on Single-Channel and Multi-Channel Speech Enhancement
    Taherian, Hassan
    Wang, Zhong-Qiu
    Chang, Jorge
    Wang, DeLiang
    IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2020, 28 : 1293 - 1302
  • [5] Joint Optimization of Perceptual Gain Function and Deep Neural Networks for Single-Channel Speech Enhancement
    Han, Wei
    Zhang, Xiongwei
    Min, Gang
    Zhou, Xingyu
    Sun, Meng
    IEICE TRANSACTIONS ON FUNDAMENTALS OF ELECTRONICS COMMUNICATIONS AND COMPUTER SCIENCES, 2017, E100A (02) : 714 - 717
  • [6] A Multi-Task Scheme for Supervised DNN-Based Single-Channel Speech Enhancement by Using Speech Presence Probability as the Secondary Training Target
    Wang, Lei
    Zhu, Jie
    Sun, Kangbo
    IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS, 2021, E104D (11) : 1963 - 1970
  • [7] Single-Channel Speech Enhancement Based on Psychoacoustic Masking
    Zhou, Tingting
    Zeng, Yumin
    Wang, Rongrong
    JOURNAL OF THE AUDIO ENGINEERING SOCIETY, 2017, 65 (04) : 272 - 284
  • [8] Single-Channel Multitalker Speech Recognition
    Rennie, Steven J.
    Hershey, John R.
    Olsen, Peder A.
    IEEE SIGNAL PROCESSING MAGAZINE, 2010, 27 (06) : 66 - 80
  • [9] Weak Speech Recovery for Single-Channel Speech Enhancement
    Wong, Arthur
    Ming, Kok
    Low, Siow Yong
    2012 4TH INTERNATIONAL CONFERENCE ON INTELLIGENT AND ADVANCED SYSTEMS (ICIAS), VOLS 1-2, 2012 : 627 - 631
  • [10] Single-channel speech enhancement based on frequency domain ALE
    Nakanishi, Isao
    Nagata, Yuudai
    Itoh, Yoshio
    Fukui, Yutaka
    2006 IEEE INTERNATIONAL SYMPOSIUM ON CIRCUITS AND SYSTEMS, VOLS 1-11, PROCEEDINGS, 2006 : 2541 - 2544