Estimation of general identifiable linear dynamic models with an application in speech recognition

Cited by: 0
Authors
Tsontzos, G. [1]
Diakoloukas, V. [1]
Koniaris, Ch. [1]
Digalakis, V. [1]
Affiliations
[1] Tech Univ Crete, Dept Elect & Comp Engn, GR-73100 Khania, Greece
Keywords
speech recognition; modeling; identification
DOI
Not available
Chinese Library Classification (CLC) number
O42 [Acoustics]
Discipline classification code
070206; 082403
Abstract
Although Hidden Markov Models (HMMs) provide a relatively efficient modeling framework for speech recognition, they suffer from several shortcomings that set upper bounds on the performance that can be achieved. Alternatively, linear dynamic models (LDMs) can be used to model speech segments. Several implementations of LDMs have been proposed in the literature. However, all had a restricted structure to satisfy identifiability constraints. In this paper, we relax all these constraints and use a general, canonical form for a linear state-space system that guarantees identifiability for arbitrary state and observation vector dimensions. For this system, we present a novel, element-wise Maximum Likelihood (ML) estimation method. Classification experiments on the AURORA2 speech database show performance gains compared to HMMs, particularly under highly noisy conditions.
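As an illustration of the kind of model the abstract refers to, the following is a minimal sketch of a generic linear dynamic model, x[t+1] = F x[t] + w[t], y[t] = H x[t] + v[t], scored with a Kalman filter and used to classify a segment by maximum likelihood. It is not the canonical form or the element-wise ML estimator proposed in the paper, neither of which is specified in this record. The names F, H, Q, R, x0, P0 and both functions are assumptions made for the example, and the model parameters are taken as given rather than estimated.

```python
import numpy as np

def kalman_loglik(y, F, H, Q, R, x0, P0):
    """Log-likelihood of observations y (T x m) under a Gaussian linear
    state-space model, with prior mean x0 and covariance P0 on the first state."""
    x, P = x0.copy(), P0.copy()
    m = y.shape[1]
    ll = 0.0
    for obs in y:
        # Innovation (prediction error) and its covariance
        e = obs - H @ x
        S = H @ P @ H.T + R
        ll += -0.5 * (m * np.log(2 * np.pi)
                      + np.linalg.slogdet(S)[1]
                      + e @ np.linalg.solve(S, e))
        # Measurement update
        K = P @ H.T @ np.linalg.inv(S)
        x = x + K @ e
        P = P - K @ H @ P
        # Time update
        x = F @ x
        P = F @ P @ F.T + Q
    return ll

def simulate(F, H, Q, R, x0, T, rng):
    """Draw T observations from the model, starting from state x0."""
    n, m = F.shape[0], H.shape[0]
    x = x0.copy()
    ys = np.empty((T, m))
    for t in range(T):
        ys[t] = H @ x + rng.multivariate_normal(np.zeros(m), R)
        x = F @ x + rng.multivariate_normal(np.zeros(n), Q)
    return ys

# Two hypothetical "class" models that differ only in their state dynamics.
H = np.array([[1.0]])
Q = np.array([[0.1]])
R = np.array([[0.1]])
x0 = np.zeros(1)
P0 = np.eye(1)
models = {"a": np.array([[0.95]]), "b": np.array([[-0.95]])}

rng = np.random.default_rng(0)
segment = simulate(models["a"], H, Q, R, x0, T=200, rng=rng)

# Classify the segment as the model with the highest log-likelihood.
scores = {name: kalman_loglik(segment, F, H, Q, R, x0, P0)
          for name, F in models.items()}
best = max(scores, key=scores.get)
```

In this sketch each class is represented by one model, and a segment is assigned to the class whose model gives the highest log-likelihood. This mirrors segment classification in general terms only and makes no claim about the experimental setup used on AURORA2.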
Pages: 453 / +
Page count: 2