A speaker adaptive Chinese syllable recognition system based on discriminative training

Cited by: 0

Authors:

Zhou, L

Imai, S

Affiliation:

Source:

1996 IEEE TENCON - DIGITAL SIGNAL PROCESSING APPLICATIONS PROCEEDINGS, VOLS 1 AND 2 | 1996

Keywords:

DOI:

Not available

Chinese Library Classification number:

TP301 [Theory, Methods];

Discipline classification code:

081202;

Abstract:

In this paper, we present two speaker adaptation methods for implementing an MSVQ-based adaptive Chinese syllable recognition system. The first method is feature normalization, in which we model inter-speaker variability as a linear transformation. Applying this normalization to the target speaker's speech reduces the inter-speaker acoustic variability. For the second adaptation method, we first present an implementation of the MCE/GPD algorithm for discriminatively training an MSVQ-based speech recognizer. This method is expected to separate confusable classes and to enhance speaker adaptation capability. We carried out recognition experiments on CRDB, the standard Chinese syllable database in China, to assess performance. The results show that when both adaptation methods are combined, the error rate reduction on open data exceeds 62% with a single set of adaptation training data. As the amount of training data increases, speaker adaptation capability improves using MCE/GPD training alone. With 5 sets of training data, the average recognition rate for two new speakers improved from 72.87% to 97.31%, which is the best performance reported on this database.

Pages: 31-36

Page count: 6
