Modelling non-stationary noise with spectral factorisation in automatic speech recognition

Cited by: 16
Authors
Hurmalainen, Antti [1]
Gemmeke, Jort F. [2]
Virtanen, Tuomas [1]
Affiliations
[1] Tampere Univ Technol, Dept Signal Proc, FI-33101 Tampere, Finland
[2] Katholieke Univ Leuven, Dept ESAT PSI, B-3001 Louvain, Belgium
Source
COMPUTER SPEECH AND LANGUAGE | 2013 / Vol. 27 / Issue 03
Funding
Academy of Finland;
Keywords
Automatic speech recognition; Noise robustness; Non-stationary noise; Non-negative spectral factorisation; Exemplar-based; NONNEGATIVE MATRIX FACTORIZATION; SEPARATION; ALGORITHMS;
DOI
10.1016/j.csl.2012.07.008
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Speech recognition systems intended for everyday use must be able to cope with a large variety of noise types and levels, including highly non-stationary multi-source mixtures. This study applies spectral factorisation algorithms and long temporal context for separating speech and noise from mixed signals. To adapt the system to varying environments, noise models are acquired from the context, or learnt from the mixture itself without prior information. We also propose methods for reducing the size of the bases used for speech and noise modelling by 20-40 times for better practical applicability. We evaluate the performance of the methods both as a standalone classifier and as a signal-enhancing front-end for external recognisers. For the CHiME noisy speech corpus containing non-stationary multi-source household noises at signal-to-noise ratios ranging from +9 to -6 dB, we report average keyword recognition rates up to 87.8% using a single-stream sparse classification algorithm. © 2012 Elsevier Ltd. All rights reserved.
Pages: 763-779
Number of pages: 17
Related papers
50 in total
  • [41] Hidden Markov models with templates as non-stationary states: An application to speech recognition
    Ghitza, Oded
    Sondhi, M.Mohan
    Computer Speech and Language, 1993, 7 (02) : 101 - 119
  • [42] Non-stationary correlation matrices and noise
    Martins, Andre C. R.
    PHYSICA A-STATISTICAL MECHANICS AND ITS APPLICATIONS, 2007, 379 (02) : 552 - 558
  • [43] The phase space of non-stationary noise
    Galleani, L
    Cohen, L
    JOURNAL OF MODERN OPTICS, 2004, 51 (16-18) : 2731 - 2740
  • [44] Modelling non-stationary 'Big Data'
    Castle, Jennifer L.
    Doornik, Jurgen A.
    Hendry, David F.
    INTERNATIONAL JOURNAL OF FORECASTING, 2021, 37 (04) : 1556 - 1575
  • [45] DETECTION OF A NON-STATIONARY SIGNAL IN NOISE
    MCNEIL, DR
    AUSTRALIAN JOURNAL OF PHYSICS, 1967, 20 (03) : 325 - +
  • [46] NON-STATIONARY NOISE POWER SPECTRAL DENSITY ESTIMATION BASED ON REGIONAL STATISTICS
    Li, Xiaofei
    Girin, Laurent
    Gannot, Sharon
    Horaud, Radu
    2016 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING PROCEEDINGS, 2016 : 181 - 185
  • [47] Stationary and non-stationary noise in superconducting quantum devices
    Martin, I.
    Bulaevskii, L.
    Shnirman, A.
    Galperin, Y. M.
    NOISE AND FLUCTUATIONS IN CIRCUITS, DEVICES, AND MATERIALS, 2007, 6600
  • [48] Speech Enhancement by Online Non-negative Spectrogram Decomposition in Non-stationary Noise Environments
    Duan, Zhiyao
    Mysore, Gautham J.
    Smaragdis, Paris
    13TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2012 (INTERSPEECH 2012), VOLS 1-3, 2012 : 594 - 597
  • [49] Continuous speech recognition under non-stationary musical environments based on speech state transition model
    Fujimoto, M
    Ariki, Y
    2001 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOLS I-VI, PROCEEDINGS, 2001 : 297 - 300
  • [50] Dynamic adjustment of the forgetting factor in adaptive filters for non-stationary noise cancellation in speech
    Martinez, R
    Gomez, P
    Alvarez, A
    Nieto, V
    Rodellar, V
    Rubio, M
    Perez, M
    PROCEEDINGS OF THE 1998 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING, VOLS 1-6, 1998 : 1009 - 1012