Continuous speech recognition based on general factor dependent acoustic models

Cited by: 4

Authors:

Suzuki, H [1]

Zen, H

Nankaku, Y

Miyajima, C

Tokuda, K

Kitamura, T

Affiliations:

[1] Nagoya Inst Technol, Dept Comp Sci & Engn, Nagoya, Aichi 4668555, Japan

[2] Nagoya Univ, Dept Media Sci, Nagoya, Aichi 4668603, Japan

Source:

IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS | 2005 / Vol. E88D / No. 03

Keywords:

continuous speech recognition; triphone HMMs; context clustering; Bayesian networks; voice characteristic; noise environment;

DOI:

10.1093/ietisy/e88-d.3.410

Chinese Library Classification (CLC) number:

TP [Automation technology, computer technology];

Subject classification code:

0812;

Abstract:

This paper describes continuous speech recognition incorporating additional complementary information, e.g., voice characteristics, speaking styles, linguistic information, and noise environment, into HMM-based acoustic modeling. In speech recognition systems, context-dependent HMMs, i.e., triphones, and tree-based context clustering have commonly been used. Several attempts to utilize not only phonetic contexts, but also additional complementary information based on context (factor) dependent HMMs have been made in recent years. However, when the additional factors for testing data are unobserved, a method for obtaining factor labels is required before decoding. In this paper, we propose a model integration technique based on general factor dependent HMMs for decoding. The integrated HMMs can be used by a conventional decoder as standard triphone HMMs with Gaussian mixture densities. Moreover, by using the results of context clustering, the proposed method can determine an optimal number of mixture components for each state depending on the degree of influence from additional factors. Phoneme recognition experiments using voice characteristic labels show significant improvements with a small number of model parameters, and a 19.3% error reduction was obtained in noise environment experiments.
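The abstract does not spell out how the integration is carried out, so the following is only a minimal sketch of one plausible reading: the unobserved factor is marginalized out with prior weights, and factor values that context clustering assigned to the same leaf share one Gaussian. The function names, the one-dimensional Gaussians, the factor labels, and the priors are all hypothetical and are not taken from the paper.

```python
import math
from collections import defaultdict


def integrate_factor_models(factor_priors, leaf_of_factor, leaf_gaussians):
    """Collapse factor-dependent Gaussians of one state into a mixture.

    Assumption (not from the paper): the mixture weight of a leaf is the
    summed prior of all factor values that clustering mapped to that leaf.

    factor_priors : {factor_label: prior probability}
    leaf_of_factor: {factor_label: leaf_id} for this HMM state
    leaf_gaussians: {leaf_id: (mean, variance)}
    Returns a list of (weight, mean, variance) tuples.
    """
    weight = defaultdict(float)
    for factor, prior in factor_priors.items():
        weight[leaf_of_factor[factor]] += prior
    total = sum(weight.values())
    return [(w / total, *leaf_gaussians[leaf])
            for leaf, w in sorted(weight.items())]


def gaussian_pdf(x, mean, var):
    """Density of a one-dimensional Gaussian."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2 * math.pi * var)


def mixture_likelihood(x, mixture):
    """Output probability of the integrated state, usable like a plain GMM."""
    return sum(w * gaussian_pdf(x, m, v) for w, m, v in mixture)


# Hypothetical noise-environment priors.
priors = {"clean": 0.5, "car": 0.25, "babble": 0.25}

# State A: clustering split on the factor, so two leaves survive.
mix_a = integrate_factor_models(
    priors,
    {"clean": 0, "car": 1, "babble": 1},
    {0: (0.0, 1.0), 1: (2.0, 1.5)},
)

# State B: the factor had no influence, so every value shares one leaf.
mix_b = integrate_factor_models(
    priors,
    {"clean": 0, "car": 0, "babble": 0},
    {0: (1.0, 1.0)},
)
```

Under these assumptions, state A ends up with two mixture components and state B with one, which illustrates how the number of components per state could follow from how strongly the additional factor affected the clustering of that state.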


Pages: 410-417

Number of pages: 8
