A hybrid statistical model to generate pronunciation variants of words

Cited by: 0

Authors:

Vazirnezhad, B [1]

Almasganj, F [1]

Bijankhan, M [1]

Affiliation:

[1] Amirkabir Univ Technol, Biomed Engn Fac, Tehran, Iran

Source:

PROCEEDINGS OF THE 2005 IEEE INTERNATIONAL CONFERENCE ON NATURAL LANGUAGE PROCESSING AND KNOWLEDGE ENGINEERING (IEEE NLP-KE'05) | 2005

DOI:

Not available

Chinese Library Classification number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Generating pronunciation variants of words is an important applied topic in speech research and is used extensively in automatic speech segmentation and recognition systems. Decision trees are widely used for this purpose to model pronunciation variants of words and sub-word units. When the units are words and the vocabulary is very large, training the necessary decision trees requires a huge amount of speech data containing every needed word with a sufficient number of occurrences of each. Besides demanding very large data, this approach requires additional corpus material for each new word. To solve these problems, we use generalized decision trees, in which each tree is trained for a group of words with similar phonemic structure instead of for a single word. These trees predict the regions of a word in which substitution, deletion, and insertion of phonemes are likely to occur. Statistical contextual rules, extracted from a large speech corpus, are then applied to these regions to generate the word's variants. This new hybrid d-tree/c-rule approach takes word phonological structure, stress, and phone context into account simultaneously, and a speech corpus of ordinary size is sufficient to train its models. Using the word variants obtained by this method in the lexicon of "SHENAVA", a Persian ACSR, yielded a relative WER reduction of up to 6%.
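
The two-stage idea in the abstract (first predict the regions of a word that may change, then apply contextual rules inside those regions) can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration: `predict_regions` is a stub standing in for a trained generalized decision tree, the `RULES` table and its probabilities are invented, and the phone strings are hypothetical. Only substitution and deletion are covered, and the record does not describe the authors' actual implementation.

```python
from itertools import product

# Hypothetical contextual rules: (left, phone, right) -> [(replacement, probability)].
# An empty replacement models deletion. Values are illustrative, not from the paper.
RULES = {
    ("s", "t", "#"): [((), 0.4)],
    ("a", "n", "b"): [(("m",), 0.3)],
}


def predict_regions(phones):
    """Stand-in for a generalized decision tree.

    A trained tree would return only the positions likely to change;
    this stub simply flags every position.
    """
    return set(range(len(phones)))


def generate_variants(phones, rules=RULES, min_weight=0.05):
    """Return (variant, weight) pairs, highest weight first."""
    padded = ["#"] + list(phones) + ["#"]  # "#" marks a word boundary
    regions = predict_regions(phones)
    options = []
    for i, phone in enumerate(phones):
        alternatives = []
        if i in regions:
            context = (padded[i], phone, padded[i + 2])
            alternatives = rules.get(context, [])
        # The canonical phone keeps whatever probability the rules leave over.
        keep = 1.0 - sum(prob for _, prob in alternatives)
        options.append([((phone,), keep)] + list(alternatives))

    variants = {}
    for combo in product(*options):
        sequence = tuple(p for replacement, _ in combo for p in replacement)
        weight = 1.0
        for _, prob in combo:
            weight *= prob
        if weight >= min_weight:
            variants[sequence] = variants.get(sequence, 0.0) + weight
    return sorted(variants.items(), key=lambda item: -item[1])


variants = generate_variants(["d", "a", "s", "t"])
```

With the invented rule table above, the call yields the canonical form and one variant with the final phone deleted. Restricting `predict_regions` to fewer positions would shrink the search space, which is the role the abstract assigns to the decision trees.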

Pages: 106-110

Page count: 5
