Language model adaptation using mixtures and an exponentially decaying cache

Authors
Clarkson, PR
Robinson, AJ
DOI
Not available
Chinese Library Classification
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
This paper presents two techniques for language model adaptation. The first is based on the use of mixtures of language models: the training text is partitioned according to topic, a language model is constructed for each component, and at recognition time appropriate weightings are assigned to each component to model the observed style of language. The second technique is based on augmenting the standard trigram model with a cache component in which word recurrence probabilities decay exponentially over time. Both techniques yield a significant reduction in perplexity over the baseline trigram language model when faced with multi-domain test text, the mixture-based model giving a 24% reduction and the cache-based model giving a 14% reduction. The two techniques attack the problem of adaptation at different scales, and as a result can be used in parallel to give a total perplexity reduction of 30%.
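As an illustration only, the two ideas named in the abstract can be sketched in a few lines of Python. The abstract does not give the paper's formulas, so everything specific here is an assumption: the decay form exp(-decay_rate * distance), the use of linear interpolation to combine the cache with a base model, the use of EM to choose the mixture weights, and all function and parameter names are hypothetical.

```python
import math
from collections import defaultdict


def decaying_cache_probs(history, decay_rate):
    """Cache distribution over previously seen words.

    Assumption: a word seen `distance` positions ago contributes
    exp(-decay_rate * distance), so recent words count for more.
    """
    weights = defaultdict(float)
    n = len(history)
    for i, word in enumerate(history):
        distance = n - i  # the most recent word has distance 1
        weights[word] += math.exp(-decay_rate * distance)
    total = sum(weights.values())
    if total == 0.0:
        return {}
    return {word: value / total for word, value in weights.items()}


def interpolate(p_base, p_cache, cache_weight):
    """Linearly interpolate a base (e.g. trigram) probability with a cache probability."""
    return (1.0 - cache_weight) * p_base + cache_weight * p_cache


def em_mixture_weights(component_probs, iterations=20):
    """Estimate mixture weights by EM on observed text.

    component_probs[t][k] is the probability that component model k
    assigns to the t-th observed word (given its context).
    """
    k = len(component_probs[0])
    weights = [1.0 / k] * k
    for _ in range(iterations):
        counts = [0.0] * k
        for probs in component_probs:
            denom = sum(w * p for w, p in zip(weights, probs))
            if denom == 0.0:
                continue  # no component explains this word; skip it
            for j in range(k):
                counts[j] += weights[j] * probs[j] / denom
        total = sum(counts)
        if total == 0.0:
            break  # nothing to re-estimate from
        weights = [c / total for c in counts]
    return weights


# Usage: the more recent word "b" gets the larger cache probability.
cache = decaying_cache_probs(["a", "b"], decay_rate=0.5)
# Usage: component 0 fits the observed words better, so its weight grows.
mix = em_mixture_weights([[0.9, 0.1]] * 5)
```

In this sketch the mixture weights adapt to the topic of the text seen so far, while the cache reacts to individual word recurrences over a much shorter span, which mirrors the abstract's remark that the two techniques work at different scales.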
Pages: 799-802
Page count: 4