Investigations of Issues for Using Multiple Acoustic Models to Improve Continuous Speech Recognition

Cited by: 0
Authors
Zhang, Rong [1]
Rudnicky, Alexander I. [1]
Affiliations
[1] Carnegie Mellon Univ, Sch Comp Sci, Language Technol Inst, Pittsburgh, PA 15213 USA
Keywords
Boosting; ROVER
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
This paper investigates two important issues in constructing and combining ensembles of acoustic models to reduce recognition errors. First, we investigate the applicability of the AnyBoost algorithm to acoustic model training. AnyBoost is a generalized Boosting method that allows an arbitrary loss function to be used as the training criterion for constructing an ensemble of classifiers. We choose the minimum classification error (MCE) discriminative objective function for our experiments. Initial test results on a real-world meeting recognition corpus show that AnyBoost is a competitive alternative to the standard AdaBoost algorithm. Second, we investigate ROVER-based combination, focusing on the technique for selecting the correct hypothesized words from the aligned word transition network (WTN). We propose a neural-network-based insertion detection and word scoring scheme for this purpose. In our experiments, this approach consistently outperforms the voting technique currently used by ROVER.
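As a point of reference for the ROVER voting step mentioned in the abstract, the following is a minimal sketch of plain frequency-of-occurrence voting over an already aligned WTN. It is not the neural network scheme proposed in the paper. The function name `vote_by_frequency`, the use of `None` as the null (no word) marker, and the list-of-lists representation of the WTN are all assumptions made for illustration.

```python
from collections import Counter
from typing import List, Optional

# Assumed convention: None marks a slot where a system proposes no word.
NULL = None


def vote_by_frequency(aligned: List[List[Optional[str]]]) -> List[str]:
    """Pick the most frequent entry in each slot of an aligned WTN.

    aligned[i][j] is the word that system i proposes in slot j, or NULL.
    All hypotheses are assumed to be aligned to the same number of slots.
    """
    if not aligned:
        return []
    n_slots = len(aligned[0])
    if any(len(hyp) != n_slots for hyp in aligned):
        raise ValueError("all aligned hypotheses must have the same length")

    output: List[str] = []
    for slot in zip(*aligned):
        counts = Counter(slot)
        # On equal counts, most_common keeps first-seen order, so ties
        # fall to the system listed earliest.
        winner, _ = counts.most_common(1)[0]
        # A winning NULL means the words in this slot are treated as
        # insertions and nothing is emitted.
        if winner is not NULL:
            output.append(winner)
    return output


if __name__ == "__main__":
    wtn = [
        ["the", "cat", NULL, "sat"],
        ["the", "hat", NULL, "sat"],
        ["a", "cat", "has", "sat"],
    ]
    print(vote_by_frequency(wtn))
```

In this sketch the decision to drop a slot depends only on how many systems output the null marker, which is the kind of decision the abstract's insertion detection step is meant to improve.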
Pages: 529-532
Page count: 4
Related papers
50 in total
  • [1] Investigations on Features for Log-Linear Acoustic Models in Continuous Speech Recognition
    Wiesler, S.
    Nussbaum-Thom, M.
    Heigold, G.
    Schlueter, R.
    Ney, H.
    2009 IEEE WORKSHOP ON AUTOMATIC SPEECH RECOGNITION & UNDERSTANDING (ASRU 2009), 2009 : 52 - 57
  • [2] Combining Acoustic Name Spotting and Continuous Context Models to improve Spoken Person Name Recognition in Speech
    Bigot, Benjamin
    Senay, Gregory
    Linares, Georges
    Fredouille, Corinne
    Dufour, Richard
    14TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2013), VOLS 1-5, 2013 : 2538 - 2542
  • [3] Using multiple acoustic feature sets for speech recognition
    Zolnay, Andras
    Kocharov, Daniil
    Schlueter, Ralf
    Ney, Hermann
    SPEECH COMMUNICATION, 2007, 49 (06) : 514 - 525
  • [4] Acoustic models of the elderly for large-vocabulary continuous speech recognition
    Baba, A
    Yoshizawa, S
    Yamada, M
    Lee, A
    Shikano, K
    ELECTRONICS AND COMMUNICATIONS IN JAPAN PART II-ELECTRONICS, 2004, 87 (07) : 49 - 57
  • [5] Development & evaluation of different acoustic models for Malayalam continuous speech recognition
    Kurian, Cini
    Balakrishnan, Kannan
    INTERNATIONAL CONFERENCE ON COMMUNICATION TECHNOLOGY AND SYSTEM DESIGN 2011, 2012, 30 : 1081 - 1088
  • [6] Unsupervised training of acoustic models for large vocabulary continuous speech recognition
    Wessel, F
    Ney, H
    ASRU 2001: IEEE WORKSHOP ON AUTOMATIC SPEECH RECOGNITION AND UNDERSTANDING, CONFERENCE PROCEEDINGS, 2001 : 307 - 310
  • [7] Continuous speech recognition based on general factor dependent acoustic models
    Suzuki, H
    Zen, H
    Nankaku, Y
    Miyajima, C
    Tokuda, K
    Kitamura, T
    IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS, 2005, E88D (03) : 410 - 417
  • [8] Continuous speech recognition using linear dynamic models
    Ma, Tao
    Srinivasan, Sundararajan
    Lazarou, Georgios
    Picone, Joseph
    INTERNATIONAL JOURNAL OF SPEECH TECHNOLOGY, 2014, 17 (01) : 11 - 16
  • [9] Combining Multiple Acoustic Models in GMM Spaces for Robust Speech Recognition
    Kang, Byung Ok
    Kwon, Oh-Wook
    IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS, 2016, E99D (03) : 724 - 730
  • [10] Large Vocabulary Continuous Speech Recognition With Reservoir-Based Acoustic Models
    Triefenbach, Fabian
    Demuynck, Kris
    Martens, Jean-Pierre
    IEEE SIGNAL PROCESSING LETTERS, 2014, 21 (03) : 311 - 315