Robust Keyword Spotting for Noisy Environments by Leveraging Speech Enhancement and Speech Presence Probability

Cited by: 1
Authors
Yang, Chouchang [1]
Saidutta, Yashas Malur [1]
Srinivasa, Rakshith Sharma [1]
Lee, Ching-Hua [1]
Shen, Yilin [1]
Jin, Hongxia [1]
Affiliations
[1] Samsung Res Amer, Mountain View, CA 94043 USA
Source
Keywords
keyword spotting; speech commands; speech presence probability; noise robust; speech enhancement;
DOI
10.21437/Interspeech.2023-2222
Chinese Library Classification (CLC) number
O42 [Acoustics];
Discipline classification code
070206; 082403;
Abstract
Although various deep keyword spotting (KWS) systems have demonstrated promising performance in relatively noise-free environments, accurate keyword detection in the presence of strong noise remains challenging. Room acoustics and noise conditions can be highly diverse, leading to drastic performance degradation if not handled carefully. In this paper, we propose a noise management front-end called SE-SPP Net that jointly performs speech enhancement (SE) and speech presence probability (SPP) estimation for robust KWS in noise. The SE-SPP Net estimates both the denoised Mel spectrogram and the position of the speech utterance in the noisy signal, where the latter is estimated as the probability that a particular time-frequency bin contains speech. Further, it comes at almost no additional cost in model size compared to a model that estimates the denoised speech. Our SE-SPP Net can improve noisy KWS performance by up to 7% compared to a similarly sized state-of-the-art model at an SNR of -10 dB.
Pages: 1638-1642
Number of pages: 5