Improved robustness of noisy speech HMMs based on weighted variance expansion

Authors
Kanno, S [1]
Funada, T [1]
Affiliations
[1] Kanazawa Univ, Ind Res Inst Ishikawa, Kanazawa, Ishikawa 9200223, Japan
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Under field conditions, the noise spectrum and the SNR often vary abruptly because the noise is non-stationary. Speech recognition performance degrades rapidly when the noise conditions during recognition differ from those during training or adaptation, so HMMs must be made robust to abrupt variations in noise. In this paper, we propose a method that modifies the output probability at states sensitive to noise by using weighted variance expansion based on the power of the state or on the probability distribution, in order to improve performance. The effectiveness of the method was examined for two types of noisy speech HMMs (one trained at a specific SNR, the other trained at five SNRs) through speaker-independent word recognition experiments using noise from two factories. The results show that the method improved the robustness of the HMMs against variation in noise conditions (noise type and SNR).
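The general idea of variance expansion mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not give the weighting formula, so the scaling rule `1 + alpha * weight`, the function names, and all numeric values below are assumptions chosen only for illustration, not the authors' method. The sketch shows that widening the variances of a diagonal-covariance Gaussian output distribution raises the log-likelihood of an observation far from the mean (as under mismatched noise) while lowering it for an observation at the mean.

```python
import numpy as np

def diag_gauss_loglik(x, mean, var):
    """Log-likelihood of x under a diagonal-covariance Gaussian."""
    x, mean, var = (np.asarray(a, dtype=float) for a in (x, mean, var))
    return float(-0.5 * np.sum(np.log(2.0 * np.pi * var) + (x - mean) ** 2 / var))

def expand_variance(var, weight, alpha=1.0):
    """Scale variances by (1 + alpha * weight).

    Hypothetical rule: `weight` in [0, 1] stands in for a per-state
    noise-sensitivity weight; the paper's actual weighting is not
    specified in the abstract.
    """
    return np.asarray(var, dtype=float) * (1.0 + alpha * weight)

# Toy state output distribution (illustrative values only).
mean = np.array([0.0, 0.0])
var = np.array([1.0, 1.0])

x_mismatched = np.array([3.0, -3.0])  # observation shifted by unseen noise
x_matched = np.array([0.0, 0.0])      # observation at the state mean

expanded = expand_variance(var, weight=1.0, alpha=3.0)  # variances become 4.0

ll_mismatched_orig = diag_gauss_loglik(x_mismatched, mean, var)
ll_mismatched_exp = diag_gauss_loglik(x_mismatched, mean, expanded)
ll_matched_orig = diag_gauss_loglik(x_matched, mean, var)
ll_matched_exp = diag_gauss_loglik(x_matched, mean, expanded)

print(ll_mismatched_orig, ll_mismatched_exp)
print(ll_matched_orig, ll_matched_exp)
```

In this toy setting the expansion trades some sharpness under matched conditions for a smaller penalty under mismatch, which is why a per-state weight (rather than a uniform expansion) is a natural design choice.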
Pages: 556-559
Number of pages: 4
Related papers
50 items in total
  • [21] Heteroscedastic discriminant analysis and reduced rank HMMs for improved speech recognition
    Kumar, N
    Andreou, AG
    SPEECH COMMUNICATION, 1998, 26 (04) : 283 - 297
  • [22] Model Adaptation Based on Improved Variance Estimation for Robust Speech Recognition
    Lu, Yong
    Xu, Zongyu
    Yan, Qin
    Zhou, Lin
    2012 INTERNATIONAL CONFERENCE ON WIRELESS COMMUNICATIONS AND SIGNAL PROCESSING (WCSP 2012), 2012,
  • [23] A frequency-weighted HMM based on minimum error classification for noisy speech recognition
    Matsumoto, H
    Ono, M
    1997 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOLS I - V: VOL I: PLENARY, EXPERT SUMMARIES, SPECIAL, AUDIO, UNDERWATER ACOUSTICS, VLSI; VOL II: SPEECH PROCESSING; VOL III: SPEECH PROCESSING, DIGITAL SIGNAL PROCESSING; VOL IV: MULTIDIMENSIONAL SIGNAL PROCESSING, NEURAL NETWORKS - VOL V: STATISTICAL SIGNAL AND ARRAY PROCESSING, APPLICATIONS, 1997, : 1511 - 1514
  • [24] Formant tracking linear prediction model using HMMs and Kalman filters for noisy speech processing
    Yan, Qin
    Vaseghi, Saeed
    Zavarehei, Esfandiar
    Milner, Ben
    Darch, Jonathan
    White, Paul
    Andrianakis, Ioannis
    COMPUTER SPEECH AND LANGUAGE, 2007, 21 (03) : 543 - 561
  • [25] Speech-to-face movement synthesis based on HMMs
    Kakihara, K
    Nakamura, S
    Shikano, K
    2000 IEEE INTERNATIONAL CONFERENCE ON MULTIMEDIA AND EXPO, PROCEEDINGS VOLS I-III, 2000, : 427 - 430
  • [26] NONUNIFORM UNIT BASED HMMS FOR CONTINUOUS SPEECH RECOGNITION
    MATSUMURA, T
    MATSUNAGA, S
    SPEECH COMMUNICATION, 1995, 17 (3-4) : 321 - 329
  • [27] Noisy speech recognition based on speech enhancement
    Wang, Xia
    Tang, Hongmei
    Zhao, Xiaoqun
    SNPD 2007: EIGHTH ACIS INTERNATIONAL CONFERENCE ON SOFTWARE ENGINEERING, ARTIFICIAL INTELLIGENCE, NETWORKING, AND PARALLEL/DISTRIBUTED COMPUTING, VOL 3, PROCEEDINGS, 2007, : 713 - +
  • [28] Study of Speech Features Robustness for Speaker Verification Application in Noisy Environments
    Mohammadi, Mohsen
    Sadegh Mohammadi, Hamid Reza
    2016 8TH INTERNATIONAL SYMPOSIUM ON TELECOMMUNICATIONS (IST), 2016, : 489 - 493
  • [29] Using Noisy Speech to Study the Robustness of a Continuous F0 Modelling Method in HMM-based Speech Synthesis
    Ogbureke, Kalu U.
    Cabral, Joao P.
    Carson-Berndsen, Julie
    PROCEEDINGS OF THE 6TH INTERNATIONAL CONFERENCE ON SPEECH PROSODY, VOLS I AND II, 2012, : 67 - 70
  • [30] Exploring the Robustness of Text-to-Speech Synthesis Based on Diffusion Probabilistic Models to Heavily Noisy Transcriptions
    Feng, Jingyi
    Yasuda, Yusuke
    Toda, Tomoki
    INTERSPEECH 2024, 2024, : 4408 - 4412