Phrasal Cohort Based Unsupervised Discriminative Language Modeling

Cited by: 0
Authors
Xu, Puyang [1]
Khudanpur, Sanjeev [1]
Roark, Brian
Affiliations
[1] Johns Hopkins Univ, Ctr Language & Speech Proc, Baltimore, MD 21218 USA
Source
13TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2012 (INTERSPEECH 2012), VOLS 1-3, 2012
Keywords
unsupervised training; discriminative language modeling
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Simulated confusions enable the use of large text-only corpora for discriminative language modeling by hallucinating the likely recognition outputs that each (correct) sentence would be confused with. In [1], a novel approach was introduced to simulate confusions using phrasal cohorts derived directly from recognition output. However, the described approach relied on transcribed speech to derive cohorts. In this paper, we extend the phrasal cohort technique to the fully unsupervised scenario, where transcribed data are completely absent. Experimental results show that even if the cohorts are extracted from untranscribed speech, the unsupervised training can still achieve over 40% of the gains of the supervised approach. The results are presented on NIST data sets for a state-of-the-art LVCSR system.
Pages: 198-201
Page count: 4
Related papers
50 in total
  • [1] MT-BASED ARTIFICIAL HYPOTHESIS GENERATION FOR UNSUPERVISED DISCRIMINATIVE LANGUAGE MODELING
    Dikici, Erinc
    Saraclar, Murat
    2015 23RD EUROPEAN SIGNAL PROCESSING CONFERENCE (EUSIPCO), 2015, : 1401 - 1405
  • [2] Unsupervised Discriminative Language Modeling Using Error Rate Estimator
    Oba, Takanobu
    Ogawa, Atsunori
    Hori, Takaaki
    Masataki, Hirokazu
    Nakamura, Atsushi
    14TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2013), VOLS 1-5, 2013, : 1222 - 1226
  • [3] UNSUPERVISED DISCRIMINATIVE LANGUAGE MODEL TRAINING
    Dikici, Erinc
    Saraclar, Murat
    2014 22ND SIGNAL PROCESSING AND COMMUNICATIONS APPLICATIONS CONFERENCE (SIU), 2014, : 1158 - 1161
  • [4] Utterance classification with discriminative language modeling
    Saraçlar, M
    Roark, B
    SPEECH COMMUNICATION, 2006, 48 (3-4) : 276 - 287
  • [5] Improving Unsupervised Language Model Adaptation with Discriminative Data Filtering
    Chang, Shuangyu
    Levit, Michael
    Parthasarathy, Partha
    Dumoulin, Benoit
    14TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2013), VOLS 1-5, 2013, : 1207 - 1211
  • [6] CONTINUOUS SPACE DISCRIMINATIVE LANGUAGE MODELING
    Xu, P.
    Khudanpur, S.
    Lehr, M.
    Prud'hommeaux, E.
    Glenn, N.
    Karakos, D.
    Roark, B.
    Sagae, K.
    Saraclar, M.
    Shafran, I.
    Bikel, D.
    Callison-Burch, C.
    Cao, Y.
    Hall, K.
    Hasler, E.
    Koehn, P.
    Lopez, A.
    Post, M.
    Riley, D.
    2012 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2012, : 2129 - 2132
  • [7] Unsupervised discriminative projection based on contrastive learning
    Yang, Jingwen
    Zhang, Hongjie
    Zhou, Ruojin
    Hao, Zhuangzhuang
    Jing, Ling
    KNOWLEDGE-BASED SYSTEMS, 2024, 301
  • [8] Unsupervised Latent Speaker Language Modeling
    Tam, Yik-Cheung
    Vozila, Paul
    12TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2011 (INTERSPEECH 2011), VOLS 1-5, 2011, : 1488 - 1491
  • [9] Unsupervised Accent Modeling for Language Identification
    Martinez Gonzalez, David
    Villalba Lopez, Jesus
    Lleida Solano, Eduardo
    Ortega Gimenez, Alfonso
    ADVANCES IN SPEECH AND LANGUAGE TECHNOLOGIES FOR IBERIAN LANGUAGES, IBERSPEECH 2014, 2014, 8854 : 49 - 58
  • [10] Discriminative Bilinear Language Modeling for Broadcast Transcriptions
    Kobayashi, Akio
    Ichiki, Manon
    Oku, Takahiro
    Onoe, Kazuo
    Sato, Shoei
    16TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2015), VOLS 1-5, 2015, : 453 - 457