A hybrid statistical model to generate pronunciation variants of words

Authors
Vazirnezhad, B [1]
Almasganj, F [1]
Bijankhan, M [1]
Affiliation
[1] Amirkabir Univ Technol, Biomed Engn Fac, Tehran, Iran
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Generating pronunciation variants of words is an important applied topic in speech research and is used extensively in automatic speech segmentation and recognition systems. Decision trees are widely used to model pronunciation variants of words and sub-word units. When the modeled units are words and the vocabulary is very large, training the necessary decision trees requires a huge amount of speech that contains every needed word a sufficient number of times. Besides demanding very large data, this approach needs additional corpus material for each new word. To solve these problems, we use generalized decision trees, in which each tree is trained for a group of words with similar phonemic structure instead of for a single word. These trees predict the regions of a word in which substitution, deletion, and insertion of phonemes are likely to occur. Statistical contextual rules, extracted from a large speech corpus, are then applied to these regions to generate word variants. This new hybrid d-tree/c-rule approach takes word phonological structure, stress, and phone context information into account simultaneously, and a speech corpus of ordinary size is sufficient to train its models. Using the word variants obtained by this method in the lexicon of "SHENAVA", a Persian ACSR, yielded a relative WER reduction of up to 6%.
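The two-stage idea in the abstract (first locate the regions of a word that are likely to change, then apply contextual rules only there) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not taken from the paper: `predict_regions` is a toy heuristic standing in for a trained generalized decision tree, and the `RULES` table, its probabilities, and the example phoneme string are invented.

```python
from itertools import product

BOUNDARY = "#"  # hypothetical word-boundary symbol used as context padding

def predict_regions(phonemes, stressed):
    """Toy stand-in for a generalized decision tree (assumption):
    flag the word-final position as variable unless it is stressed."""
    last = len(phonemes) - 1
    return {last} if last != stressed else set()

# Hypothetical contextual rules keyed by (left, phone, right).
# Each entry lists (replacement phones, probability); () means deletion.
RULES = {
    ("s", "t", BOUNDARY): [(("t",), 0.6), ((), 0.4)],
}

def generate_variants(phonemes, regions, rules):
    """Apply contextual rules inside the flagged regions and return
    (variant, probability) pairs sorted by decreasing probability."""
    options = []
    for i, phone in enumerate(phonemes):
        left = phonemes[i - 1] if i > 0 else BOUNDARY
        right = phonemes[i + 1] if i < len(phonemes) - 1 else BOUNDARY
        context = (left, phone, right)
        if i in regions and context in rules:
            options.append(rules[context])
        else:
            options.append([((phone,), 1.0)])  # position kept unchanged
    variants = {}
    for combo in product(*options):
        out, prob = [], 1.0
        for replacement, p in combo:
            out.extend(replacement)
            prob *= p
        key = tuple(out)
        variants[key] = variants.get(key, 0.0) + prob
    return sorted(variants.items(), key=lambda kv: -kv[1])

word = ["d", "a", "s", "t"]        # invented example phoneme string
regions = predict_regions(word, stressed=1)
variants = generate_variants(word, regions, RULES)
for phones, prob in variants:
    print(" ".join(phones), prob)
```

Restricting rule application to the predicted regions keeps the number of generated variants small. In a real system both stages would be estimated from a speech corpus instead of being written by hand.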
Pages: 106-110
Page count: 5