Investigations of Issues for Using Multiple Acoustic Models to Improve Continuous Speech Recognition

Cited by: 0
Authors
Zhang, Rong [1]
Rudnicky, Alexander I. [1]
Affiliation
[1] Carnegie Mellon Univ, Sch Comp Sci, Language Technol Inst, Pittsburgh, PA 15213 USA
Keywords
Boosting; ROVER
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
This paper investigates two important issues in constructing and combining ensembles of acoustic models to reduce recognition errors. First, we investigate the applicability of the AnyBoost algorithm to acoustic model training. AnyBoost is a generalized Boosting method that allows an arbitrary loss function to be used as the training criterion for constructing an ensemble of classifiers. We choose the minimum classification error (MCE) discriminative objective function for our experiments. Initial test results on a real-world meeting recognition corpus show that AnyBoost is a competitive alternative to the standard AdaBoost algorithm. Second, we investigate ROVER-based combination, focusing on the technique for selecting the correct hypothesized words from the aligned word transition network (WTN). We propose a neural-network-based insertion detection and word scoring scheme for this purpose. In our experiments, this approach consistently outperforms the voting technique currently used by ROVER.
Pages: 529-532
Page count: 4
Related papers
50 in total
  • [21] Integration of multiple acoustic and language models for improved Hindi speech recognition system
    R. K. Aggarwal
    M. Dave
    Aggarwal, R.K. (rka15969@gmail.com), 2012, Kluwer Academic Publishers (15) : 165 - 180
  • [22] Improving Discriminative Training for Robust Acoustic Models in Large Vocabulary Continuous Speech Recognition
    Pylkkonen, Janne
    Kurimo, Mikko
    13TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2012 (INTERSPEECH 2012), VOLS 1-3, 2012, : 1210 - 1213
  • [23] HYBRID DNN-LATENT STRUCTURED SVM ACOUSTIC MODELS FOR CONTINUOUS SPEECH RECOGNITION
    Ravuri, Suman
    2015 IEEE WORKSHOP ON AUTOMATIC SPEECH RECOGNITION AND UNDERSTANDING (ASRU), 2015, : 37 - 44
  • [24] Boosting Thai Syllable Speech Recognition Using Acoustic Models Combination
    Tangwongsan, Supachai
    Phoophuangpairoj, Rong
    ICCEE 2008: PROCEEDINGS OF THE 2008 INTERNATIONAL CONFERENCE ON COMPUTER AND ELECTRICAL ENGINEERING, 2008, : 568 - 572
  • [25] Emotional Speech Recognition Using Acoustic Models of Decomposed Component Words
    Kaveeta, Vivatchai
    Patanukhom, Karn
    2013 SECOND IAPR ASIAN CONFERENCE ON PATTERN RECOGNITION (ACPR 2013), 2013, : 115 - 119
  • [26] Speech recognition using voice-characteristic-dependent acoustic models
    Suzuki, H
    Zen, H
    Nankaku, Y
    Miyajima, C
    Tokuda, K
    Kitamura, T
    2003 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOL I, PROCEEDINGS: SPEECH PROCESSING I, 2003, : 740 - 743
  • [27] Multilingual acoustic models for speech recognition and synthesis
    Kunzmann, S
    Fischer, V
    Gonzalez, J
    Emam, O
    Günther, C
    Janke, E
    2004 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOL III, PROCEEDINGS: IMAGE AND MULTIDIMENSIONAL SIGNAL PROCESSING SPECIAL SESSIONS, 2004, : 745 - 748
  • [28] Dynamically configurable acoustic models for speech recognition
    Hwang, MY
    Huang, XD
    PROCEEDINGS OF THE 1998 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING, VOLS 1-6, 1998, : 669 - 672
  • [29] Acoustic-to-Phrase Models for Speech Recognition
    Gaur, Yashesh
    Li, Jinyu
    Meng, Zhong
    Gong, Yifan
    INTERSPEECH 2019, 2019, : 2240 - 2244
  • [30] Compact Acoustic Models for Embedded Speech Recognition
    Levy, Christophe
    Linares, Georges
    Bonastre, Jean-Francois
    EURASIP JOURNAL ON AUDIO SPEECH AND MUSIC PROCESSING, 2009,