THE IBM 2008 GALE ARABIC SPEECH TRANSCRIPTION SYSTEM

Cited by: 9
Authors
Saon, George [1]
Soltau, Hagen [1]
Chaudhari, Upendra [1]
Chu, Stephen [1]
Kingsbury, Brian [1]
Kuo, Hong-Kwang [1]
Mangu, Lidia [1]
Povey, Daniel [2]
Affiliations
[1] IBM TJ Watson Res Ctr, Yorktown Hts, NY 10598 USA
[2] Microsoft Res, Redmond, WA USA
Source
2010 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING | 2010
Keywords
Speech recognition;
DOI
10.1109/ICASSP.2010.5495640
Chinese Library Classification (CLC) number
O42 [Acoustics];
Discipline classification code
070206 ; 082403 ;
Abstract
This paper describes the Arabic broadcast transcription system fielded by IBM in the GALE Phase 3.5 machine translation evaluation. Key advances compared to our Phase 2.5 system include improved discriminative training, the use of Subspace Gaussian Mixture Models (SGMM), neural network acoustic features, variable frame rate decoding, training data partitioning experiments, unpruned n-gram language models and neural network language models. These advances were instrumental in achieving a word error rate of 8.9% on the evaluation test set.
Pages: 4378 - 4381
Page count: 4
Related papers
50 in total
  • [21] The 2010 CMU GALE Speech-to-Text System
    Metze, Florian
    Hsiao, Roger
    Jin, Qin
    Nallasamy, Udhyakumar
    Schultz, Tanja
    11TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2010 (INTERSPEECH 2010), VOLS 1-2, 2010, : 1501 - 1504
  • [22] Advances in speech transcription at IBM under the DARPA EARS program
    Chen, Stanley F.
    Kingsbury, Brian
    Mangu, Lidia
    Povey, Daniel
    Saon, George
    Soltau, Hagen
    Zweig, Geoffrey
    IEEE TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2006, 14 (05): : 1596 - 1608
  • [23] Arabic broadcast news transcription system
    Alghamdi, Mansour
    Elshafei, Moustafa
    Al-Muhtaseb, Husni
    INTERNATIONAL JOURNAL OF SPEECH TECHNOLOGY, 2007, 10 (04) : 183 - 195
  • [24] Arabic automatic segmentation system and its application for arabic speech recognition system
    Nofal, M
    Abdel-Raheem, E
    Kader, NSA
    Proceedings of the 46th IEEE International Midwest Symposium on Circuits & Systems, Vols 1-3, 2003, : 697 - 700
  • [25] Arabic ASR and MT integration for GALE
    Al-Onaizan, Yaser
    Mangu, Lidia
    2007 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOL IV, PTS 1-3, 2007, : 1285 - +
  • [26] ARABIC SOFTWARE FOR THE IBM
    RIPPIN, A
    COMPUTERS AND THE HUMANITIES, 1991, 25 (06): : 445 - 448
  • [27] COSEGMENTATION IN THE IBM TEXT-TO-SPEECH SYSTEM
    PICKERING, JB
    PROCEEDINGS : INSTITUTE OF ACOUSTICS, VOL 8, PART 7: SPEECH & HEARING, 1986, 8 : 385 - 392
  • [28] Speech Recognition System of Arabic Alphabet Based on a Telephony Arabic Corpus
    Alotaibi, Yousef Ajami
    Alghamdi, Mansour
    Alotaiby, Fahad
    IMAGE AND SIGNAL PROCESSING, PROCEEDINGS, 2010, 6134 : 122 - +
  • [29] IBM GALE Chinese Recognition System
    张世磊
    施勤
    秦勇
    刘文
    CHU Stephen M
    KUO Hong-Kwang
    MANGU Lidia
    Journal of Tsinghua University (Science and Technology), 2009, 49 (S1) : 1249 - 1253
  • [30] IBM GALE Chinese Recognition System
    张世磊
    施勤
    秦勇
    刘文
    CHU Stephen M
    KUO Hong-Kwang
    MANGU Lidia
    Journal of Tsinghua University (Science and Technology), 2009, 49 (S1) : 1249 - 1253