Simplified neural network architectures for a hybrid speech recognition system with small vocabulary size

Citations: 0
Authors
Sedarat, H [1]
Khadem, R [1]
Franco, H [1]
Affiliations
[1] Stanford Univ, Dept Elect Engn, Informat Syst Lab, Stanford, CA 94305 USA
Keywords
DOI
Not available
Chinese Library Classification
O42 [Acoustics];
Discipline classification codes
070206; 082403;
Abstract
Recent studies suggest that a hybrid speech recognition system based on a hidden Markov model (HMM) with a neural network (NN) subsystem as the estimator of the state conditional observation probability may have some advantages over the conventional HMMs with Gaussian mixture models for the observation probabilities. The HMM and NN modules are typically treated as separate entities in a hybrid system. This paper, however, suggests that the a priori knowledge of HMM structure can be beneficial in the design of the NN subsystem. A case of isolated word recognition is studied to demonstrate that a substantially simplified NN can be achieved in a structured HMM by applying a Bayesian factorization and pre-classification. The results indicate a similar performance to that obtained with the classical approach with much less complexity in NN structure.
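The abstract does not spell out the factorization it uses, so the following is only a minimal sketch of one common reading, not the authors' implementation. It assumes the HMM states are grouped into coarse classes c, that one small network scores the classes and one small network per class scores the states inside it, and that the state posterior is recovered as P(q | x) = P(c | x) * P(q | c, x). It also assumes the usual hybrid convention of dividing posteriors by state priors to obtain scaled likelihoods for the HMM. All function names, the class grouping, and the raw scores are hypothetical.

```python
import math


def softmax(scores):
    """Turn raw network scores into a probability distribution."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def factorized_state_posteriors(class_scores, state_scores_by_class):
    """Combine P(c | x) and P(q | c, x) into P(q | x) for every state.

    class_scores: raw outputs of a (hypothetical) class-level network.
    state_scores_by_class: one list of raw outputs per class, from
        (hypothetical) small per-class networks.
    Returns a dict keyed by (class_index, state_index_within_class).
    """
    class_post = softmax(class_scores)
    posteriors = {}
    for c, p_c in enumerate(class_post):
        within = softmax(state_scores_by_class[c])
        for k, p_q_given_c in enumerate(within):
            posteriors[(c, k)] = p_c * p_q_given_c
    return posteriors


def scaled_likelihoods(posteriors, priors):
    """Divide each posterior by its state prior: P(q | x) / P(q).

    By Bayes' rule this equals p(x | q) / p(x), which can stand in for
    the observation likelihood because p(x) is the same for all states.
    """
    return {q: posteriors[q] / priors[q] for q in posteriors}


if __name__ == "__main__":
    # Two classes holding 2 and 3 states; all scores are made up.
    post = factorized_state_posteriors(
        [1.0, 0.2], [[0.5, -0.5], [0.1, 0.0, -0.3]]
    )
    uniform_priors = {q: 1.0 / len(post) for q in post}
    for q, value in sorted(scaled_likelihoods(post, uniform_priors).items()):
        print(q, round(value, 4))
```

Under this reading, a pre-classification step would evaluate only the per-class network of the most probable class, which is one way the total network size and computation could shrink relative to a single network with one output per state.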
Pages: 1113 - 1116
Page count: 4