Random clusterings for language modeling

Authors
Emami, A [1]
Jelinek, F [1]
Affiliation
[1] Johns Hopkins Univ, Ctr Language & Speech Proc, Baltimore, MD 21218 USA
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
In this paper we present an application of randomization techniques to class-based n-gram language models. The idea is to derive a language model from the combination of a set of random class-based models. Each of the constituent random class-based models is built using a separate clustering obtained via a different run of a randomized clustering algorithm. The random class-based model can compensate for some of the shortcomings of conventional class-based models by combining the different solutions obtained through random clusterings. Experimental results show that the combined random class-based model improves considerably in perplexity (PPL) and word error rate (WER) over both the n-gram and baseline class-based models.
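The combination described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not specify the randomized clustering algorithm or the combination rule, so the code below assumes a uniform random word-to-class assignment as a stand-in for one clustering run, a class-based bigram with add-alpha smoothing, and linear interpolation with uniform weights. All function names (`random_clustering`, `train_class_bigram`, `combine`, `perplexity`) are hypothetical.

```python
import math
import random
from collections import Counter


def random_clustering(vocab, num_classes, rng):
    """Stand-in for one run of a randomized clustering algorithm (assumption):
    assign every word to a class uniformly at random."""
    return {w: rng.randrange(num_classes) for w in vocab}


def train_class_bigram(tokens, word2class, alpha=1.0):
    """Class-based bigram: P(w | prev) = P(c(w) | c(prev)) * P(w | c(w)),
    with add-alpha smoothing on both factors (smoothing choice is an assumption)."""
    word_counts = Counter(tokens)
    class_sizes = Counter(word2class.values())   # words per (non-empty) class
    num_used = len(class_sizes)                  # number of non-empty classes
    class_totals = Counter()                     # token count per class
    for w, n in word_counts.items():
        class_totals[word2class[w]] += n
    class_bigrams, class_context = Counter(), Counter()
    for prev, cur in zip(tokens, tokens[1:]):
        class_bigrams[(word2class[prev], word2class[cur])] += 1
        class_context[word2class[prev]] += 1

    def prob(word, prev):
        c_prev, c_word = word2class[prev], word2class[word]
        p_class = (class_bigrams[(c_prev, c_word)] + alpha) / (
            class_context[c_prev] + alpha * num_used)
        p_word = (word_counts[word] + alpha) / (
            class_totals[c_word] + alpha * class_sizes[c_word])
        return p_class * p_word

    return prob


def combine(models, weights=None):
    """Linear interpolation of the constituent models (uniform weights by default)."""
    if weights is None:
        weights = [1.0 / len(models)] * len(models)

    def prob(word, prev):
        return sum(wt * m(word, prev) for wt, m in zip(weights, models))

    return prob


def perplexity(model, tokens):
    """Perplexity of a bigram model over consecutive token pairs."""
    log_sum = sum(math.log(model(cur, prev)) for prev, cur in zip(tokens, tokens[1:]))
    return math.exp(-log_sum / (len(tokens) - 1))


# Toy data: five random clusterings, one class-based model per clustering.
tokens = "the cat sat on the mat the dog sat on the rug".split()
vocab = sorted(set(tokens))
rng = random.Random(0)
models = [train_class_bigram(tokens, random_clustering(vocab, 3, rng)) for _ in range(5)]
combined = combine(models)
print(perplexity(combined, tokens))
```

Because each constituent model is a proper distribution over the vocabulary, any convex combination of them is one as well, which is what makes the interpolated model directly comparable in perplexity to a single class-based model.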
Pages: 581-584
Page count: 4