Long-Distance Continuous Space Language Modeling for Speech Recognition

Cited by: 0

Authors:

Talaat, Mohamed [1]

Abdou, Sherif [1]

Shoman, Mahmoud [1]

Affiliations:

[1] Cairo Univ, Fac Comp & Informat, Giza 12613, Egypt

Source:

COMPUTATIONAL LINGUISTICS AND INTELLIGENT TEXT PROCESSING (CICLING 2015), PT II | 2015 / Vol. 9042

Keywords:

Language model; n-gram; Continuous space; Latent semantic analysis; Word co-occurrence matrix; Long distance; Tied-mixture model; HYBRID;

DOI:

10.1007/978-3-319-18117-2_41

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

N-gram language models have long been the most frequently used language models because they are easy to build and require minimal effort to integrate into different NLP applications. Despite their popularity, n-gram models suffer from several drawbacks, such as their limited ability to generalize to words unseen in the training data, their poor adaptability to new domains, and their focus on short-distance word relations only. To overcome these problems, continuous parameter space LMs were introduced. In these models, words are treated as vectors of real numbers rather than as discrete entities. As a result, semantic relationships between words can be quantified and integrated into the model. Infrequent words are modeled using more frequent ones that are semantically similar. In this paper we present a long-distance continuous language model based on latent semantic analysis (LSA). In the LSA framework, the word-document co-occurrence matrix is commonly used to record how many times a word occurs in a given document. The word-word co-occurrence matrix has also been used in many previous studies. In this research, we introduce a different representation of the text corpus by proposing long-distance word co-occurrence matrices. These matrices represent the long-range co-occurrences between different words at different distances in the corpus. By applying LSA to these matrices, words in the vocabulary are mapped into a continuous vector space. We represent each word with a continuous vector that preserves word order and position in the sentence. We use tied-mixture HMM modeling (TM-HMM) to robustly estimate the LM parameters and word probabilities. Experiments on the Arabic Gigaword corpus show improvements in perplexity and speech recognition results compared to the conventional n-gram model.
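
The idea of distance-specific co-occurrence matrices followed by LSA can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not give the weighting scheme, the set of distances, the projection rank, or how the per-distance vectors are combined, so the raw counts, the distances 1 and 2, the rank of 2, and the concatenation step below are all assumptions. The function names `distance_cooccurrence` and `lsa_embed` are hypothetical, and the toy corpus is invented for illustration.

```python
import numpy as np

def distance_cooccurrence(sentences, vocab, d):
    """Count how often word j appears exactly d positions after word i.

    Assumption: plain counts, no weighting or normalization.
    """
    index = {w: k for k, w in enumerate(vocab)}
    counts = np.zeros((len(vocab), len(vocab)))
    for sent in sentences:
        for pos in range(len(sent) - d):
            counts[index[sent[pos]], index[sent[pos + d]]] += 1
    return counts

def lsa_embed(matrix, rank):
    """Project each row onto the top `rank` singular directions (truncated SVD)."""
    u, s, _ = np.linalg.svd(matrix, full_matrices=False)
    return u[:, :rank] * s[:rank]

# Toy corpus (hypothetical), tokenized into sentences.
sentences = [["a", "b", "c", "a"], ["b", "c", "a", "b"]]
vocab = ["a", "b", "c"]

# One co-occurrence matrix per distance (distances 1 and 2 are assumed).
matrices = {d: distance_cooccurrence(sentences, vocab, d) for d in (1, 2)}

# Assumption: concatenate the per-distance LSA vectors into one word vector.
vectors = np.hstack([lsa_embed(m, rank=2) for m in matrices.values()])
```

Each row of `vectors` is a continuous representation of one vocabulary word. Keeping a separate matrix per distance is what lets the representation retain some information about word order and position, which a single symmetric word-word count matrix would discard.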


Pages: 549-564

Number of pages: 16
