CRIM'S FRENCH SPEECH TRANSCRIPTION SYSTEM FOR ETAPE 2011

Authors
Gupta, Vishwa [1]
Boulianne, Gilles [1]
Osterrath, Frederic [1]
Ouellet, Pierre [1]
Affiliations
[1] CRIM, Montreal, PQ, Canada
Source
2013 8TH INTERNATIONAL WORKSHOP ON SYSTEMS, SIGNAL PROCESSING AND THEIR APPLICATIONS (WOSSPA) | 2013
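Keywords
DOI
Not available
Chinese Library Classification (CLC) number
TM [Electrical engineering]; TN [Electronics and telecommunication technology]
Discipline classification code
0808; 0809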
Abstract
This paper describes CRIM's French broadcast speech transcription system for the ETAPE 2011 evaluation. The key elements of this recognizer include a dictionary of over 140,000 words, 478 hours of audio for training the acoustic models, feature-space MMI and boosted MMI discriminative training of the acoustic models, variable-frame-rate decoding with a trigram language model, lattice rescoring with a quadgram language model, a soft penalty on silence models, confusion network decoding with minimum Bayes risk, and combination of multiple recognizers with ROVER. Recognition enhancements made after the ETAPE evaluation include discriminative training of the subspace Gaussian mixture models and lattice rescoring with neural net language models.
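As a rough illustration of the confusion network decoding step named in the abstract, the sketch below picks the highest-posterior entry in each slot of a confusion network, which minimizes the expected word error under the network's own alignment. This is not CRIM's implementation: the data layout (one dictionary per slot, with the empty string standing for "no word"), the function name `consensus_decode`, and the example posteriors are all assumptions made up for illustration.

```python
from typing import Dict, List

# Hypothetical layout: one dict per slot, mapping word -> posterior probability.
# The empty string "" stands for the "no word here" (skip) hypothesis.
ConfusionNetwork = List[Dict[str, float]]


def consensus_decode(network: ConfusionNetwork) -> List[str]:
    """Return the highest-posterior word of each slot, dropping skips."""
    words = []
    for slot in network:
        best = max(slot, key=slot.get)  # entry with the largest posterior
        if best:  # an empty string means the slot is best left empty
            words.append(best)
    return words


# Made-up posteriors, for illustration only.
example = [
    {"le": 0.7, "les": 0.3},
    {"": 0.6, "petit": 0.4},
    {"chat": 0.9, "chas": 0.1},
]
print(" ".join(consensus_decode(example)))
```

In a real system the slots and posteriors would come from aligning the arcs of a recognition lattice; that alignment step is the harder part and is omitted here.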
Pages: 351-356
Page count: 6
Related papers
50 in total
  • [21] THE IBM 2009 GALE ARABIC SPEECH TRANSCRIPTION SYSTEM
    Kingsbury, Brian
    Soltau, Hagen
    Saon, George
    Chu, Stephen
    Kuo, Hong-Kwang
    Mangu, Lidia
    Ravuri, Suman
    Morgan, Nelson
    Janin, Adam
    2011 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, 2011, : 4672 - 4675
  • [22] Slovak Broadcast News Speech Recognition and Transcription System
    Lojka, Martin
    Viszlay, Peter
    Stas, Jan
    Hladek, Daniel
    Juhar, Jozef
    ADVANCES IN NETWORK-BASED INFORMATION SYSTEMS, NBIS-2018, 2019, 22 : 385 - 394
  • [23] THE IBM 2008 GALE ARABIC SPEECH TRANSCRIPTION SYSTEM
    Saon, George
    Soltau, Hagen
    Chaudhari, Upendra
    Chu, Stephen
    Kingsbury, Brian
    Kuo, Hong-Kwang
    Mangu, Lidia
    Povey, Daniel
    2010 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, 2010, : 4378 - 4381
  • [24] The 1998 HTK system for transcription of conversational telephone speech
    Hain, T
    Woodland, PC
    Niesler, TR
    Whittaker, EWD
    ICASSP '99: 1999 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, PROCEEDINGS VOLS I-VI, 1999, : 57 - 60
  • [25] The body, movement and speech (French) - Cady,S
    Unknown
    EVOLUTION PSYCHIATRIQUE, 1995, 60 (04): : 947 - 947
  • [26] CRIM’s content-based audio copy detection system for TRECVID 2009
    Vishwa Nath Gupta
    Gilles Boulianne
    Patrick Cardinal
    Multimedia Tools and Applications, 2012, 60 : 371 - 387
  • [27] CRIM's content-based audio copy detection system for TRECVID 2009
    Gupta, Vishwa Nath
    Boulianne, Gilles
    Cardinal, Patrick
    MULTIMEDIA TOOLS AND APPLICATIONS, 2012, 60 (02) : 371 - 387
  • [28] The unfolding of the verbal temporal system in French children's speech between 18 and 36 months
    Parisse, Christophe
    Morgenstern, Aliyah
    JOURNAL OF FRENCH LANGUAGE STUDIES, 2012, 22 (01) : 95 - 114
  • [29] The 2003 ISL rich transcription system for conversational telephony speech
    Soltau, H
    Yu, H
    Metze, F
    Fügen, C
    Jin, Q
    Jou, SC
    2004 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOL I, PROCEEDINGS: SPEECH PROCESSING, 2004, : 773 - 776
  • [30] The IBM 2007 speech transcription system for European parliamentary speeches
    Ramabhadran, Bhuvana
    Siohan, Olivier
    Sethy, Abhinav
    2007 IEEE WORKSHOP ON AUTOMATIC SPEECH RECOGNITION AND UNDERSTANDING, VOLS 1 AND 2, 2007, : 472 - +