Spectrum enhancement with sparse coding for robust speech recognition

Cited by: 11
Authors
He, Yongjun [1]
Sun, Guanglu [1]
Han, Jiqing [2]
Affiliations
[1] Harbin Univ Sci & Technol, Harbin 150080, Peoples R China
[2] Harbin Inst Technol, Harbin 150001, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Sparse coding; Speech denoising; Residual noise; Basis pursuit denoising; Joint compensation; Representation; Noise; Adaptation; Regression; Equations; Features; Systems
DOI
10.1016/j.dsp.2015.04.014
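Chinese Library Classification (CLC) number
TM [Electrical engineering]; TN [Electronic technology, communication technology]
Discipline classification codes
0808; 0809
Abstract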
Sparse coding has recently been introduced into speech recognition to improve noise robustness. Although several methods have been proposed, the performance of sparse coding in speech denoising remains limited. Sparse coding assumes that the representation of speech over the speech dictionary is sparse, while that of the noise is dense. This assumption clearly does not hold in the speech denoising scenario, because many noises are also sparse over the speech dictionary. In that case, the representation of noisy speech still contains noise components, which degrades performance. To solve this problem, we first analyze the assumption behind sparse coding and then propose a novel method to enhance the speech spectrum. The method first identifies the atoms that represent the noise sparsely and then selectively ignores them when reconstructing the speech, which reduces the residual noise. Speech features are then extracted from the enhanced spectrum for speech recognition. Experimental results show that the proposed method substantially improves the noise robustness of a speech recognition system. (C) 2015 Elsevier Inc. All rights reserved.
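The idea described in the abstract (code the noisy spectrum over a speech dictionary, find the atoms that also represent noise sparsely, and leave them out of the reconstruction) can be illustrated with a small NumPy sketch. This is not the authors' implementation. It assumes that noise-only frames are available, that "noise-prone" atoms can be ranked by how strongly they are activated when coding those frames, and that the sparse code is obtained with ISTA applied to the Lagrangian form of basis pursuit denoising. The function names, the `top_k` parameter, and the toy identity dictionary are all hypothetical.

```python
import numpy as np

def ista(D, y, lam=0.1, n_iter=200):
    """Approximately solve min_x 0.5*||y - D x||^2 + lam*||x||_1 with ISTA."""
    step = 1.0 / (np.linalg.norm(D, 2) ** 2)  # 1 / Lipschitz constant of the gradient
    x = np.zeros(D.shape[1])
    for _ in range(n_iter):
        g = x - step * (D.T @ (D @ x - y))                       # gradient step
        x = np.sign(g) * np.maximum(np.abs(g) - lam * step, 0.0)  # soft threshold
    return x

def noise_prone_atoms(D, noise_frames, lam=0.1, top_k=1):
    """Assumed criterion: atoms with the largest total activation on noise-only frames."""
    usage = np.zeros(D.shape[1])
    for frame in noise_frames.T:  # one spectral frame per column
        usage += np.abs(ista(D, frame, lam))
    return np.argsort(usage)[-top_k:]

def enhance(D, noisy, ignored, lam=0.1):
    """Sparse-code a noisy frame, zero the ignored atoms, and reconstruct."""
    x = ista(D, noisy, lam)
    x[ignored] = 0.0
    return D @ x

# Toy data: an identity "speech dictionary" in which atom 5 also fits the noise.
D = np.eye(8)
speech = np.array([1.0, 0.8, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0])
noise = np.array([0.0, 0.0, 0.0, 0.0, 0.0, 0.6, 0.0, 0.0])
noise_frames = np.stack([noise, 0.5 * noise], axis=1)

ignored = noise_prone_atoms(D, noise_frames)
plain = D @ ista(D, speech + noise)            # ordinary sparse reconstruction
enhanced = enhance(D, speech + noise, ignored)  # reconstruction without noise-prone atoms
print(np.linalg.norm(plain - speech), np.linalg.norm(enhanced - speech))
```

In this toy setting the ordinary reconstruction keeps the noise component carried by atom 5, while the selective reconstruction drops it. With a real learned dictionary the selection step is harder, because an atom that fits the noise may also carry speech energy.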
Pages: 59-70
Number of pages: 12