CONTEXT-DEPENDENT CONNECTIONIST PROBABILITY ESTIMATION IN A HYBRID HIDDEN MARKOV MODEL NEURAL-NET SPEECH RECOGNITION SYSTEM

Cited by: 22

Authors:

FRANCO, H

COHEN, M

MORGAN, N

RUMELHART, D

ABRASH, V

Affiliations:

[1] INT COMP SCI INST,BERKELEY,CA 94704

[2] STANFORD UNIV,STANFORD,CA 94305

Source:

COMPUTER SPEECH AND LANGUAGE | 1994 / Vol. 8 / No. 3

DOI:

10.1006/csla.1994.1010

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Subject classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

In this paper we present a training method and a network architecture for estimating context-dependent observation probabilities in the framework of a hybrid hidden Markov model (HMM)/multilayer perceptron (MLP) speaker-independent continuous speech recognition system. The context-dependent modeling approach we present here computes the HMM context-dependent observation probabilities using a Bayesian factorization in terms of context-conditioned posterior phone probabilities, which are computed with a set of MLPs, one for every relevant context. The proposed network architecture shares the input-to-hidden layer among the set of context-dependent MLPs in order to reduce the number of independent parameters. Multiple states for phone models, with different context dependence for each state, are used to model the different context effects at the beginning and end of phonetic segments. A new training procedure that "smooths" networks with different degrees of context dependence is proposed to obtain a robust estimate of the context-dependent probabilities. We have used this new architecture to model generalized biphone phonetic contexts. Tests with the speaker-independent DARPA Resource Management database have shown average reductions in word error rates of 28% using a word-pair grammar, compared to our earlier context-independent HMM/MLP hybrid.
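
The abstract does not spell out the Bayesian factorization it refers to. As a hedged illustration only, one way such a factorization can be written follows directly from Bayes' rule and the chain rule. The symbols are assumptions introduced here and are not taken from the record: x is an acoustic feature vector, q a phone class (HMM state), and c a context class.

```latex
% Illustrative identity only; notation (x, q, c) is assumed, not from the source.
\begin{align}
p(x \mid q, c)
  &= \frac{P(q, c \mid x)\, p(x)}{P(q, c)} \\
  &= \frac{P(q \mid x, c)\, P(c \mid x)\, p(x)}{P(q \mid c)\, P(c)}
\end{align}
```

Under this reading, P(q | x, c) is the context-conditioned posterior phone probability that a context-specific MLP would estimate, while P(q | c) and P(c) are priors that could be obtained from counts. The term p(x) does not depend on q or c, so it can be dropped when comparing hypotheses for the same observation. Whether the paper uses exactly this form is not stated in the abstract.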

Pages: 211-222

Page count: 12

