THE IBM 2008 GALE ARABIC SPEECH TRANSCRIPTION SYSTEM

Cited by: 9
Authors
Saon, George [1]
Soltau, Hagen [1]
Chaudhari, Upendra [1]
Chu, Stephen [1]
Kingsbury, Brian [1]
Kuo, Hong-Kwang [1]
Mangu, Lidia [1]
Povey, Daniel [2]
Affiliations
[1] IBM TJ Watson Res Ctr, Yorktown Hts, NY 10598 USA
[2] Microsoft Res, Redmond, WA USA
Keywords
Speech recognition
DOI
10.1109/ICASSP.2010.5495640
Chinese Library Classification number
O42 [Acoustics]
Subject classification codes
070206; 082403
Abstract
This paper describes the Arabic broadcast transcription system fielded by IBM in the GALE Phase 3.5 machine translation evaluation. Key advances compared to our Phase 2.5 system include improved discriminative training, the use of Subspace Gaussian Mixture Models (SGMM), neural network acoustic features, variable frame rate decoding, training data partitioning experiments, unpruned n-gram language models and neural network language models. These advances were instrumental in achieving a word error rate of 8.9% on the evaluation test set.
Pages: 4378-4381
Page count: 4
Related papers
50 in total
  • [1] THE IBM 2009 GALE ARABIC SPEECH TRANSCRIPTION SYSTEM
    Kingsbury, Brian
    Soltau, Hagen
    Saon, George
    Chu, Stephen
    Kuo, Hong-Kwang
    Mangu, Lidia
    Ravuri, Suman
    Morgan, Nelson
    Janin, Adam
    2011 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, 2011, : 4672 - 4675
  • [2] Advances in Arabic Speech Transcription at IBM Under the DARPA GALE Program
    Soltau, Hagen
    Saon, George
    Kingsbury, Brian
    Kuo, Hong-Kwang Jeff
    Mangu, Lidia
    Povey, Daniel
    Emami, Ahmad
    IEEE TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2009, 17 (05): : 884 - 894
  • [3] The IBM 2006 gale Arabic ASR system
    Soltau, Hagen
    Saon, George
    2007 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOL IV, PTS 1-3, 2007, : 349 - +
  • [4] IBM GALE Mandarin transcription system
    Zhang, Shilei
    Shi, Qin
    Qin, Yong
    Liu, Wen
    Chu, Stephen-M
    Kuo, Hong-Kwang
    Mangu, Lidia
    Qinghua Daxue Xuebao/Journal of Tsinghua University, 2009, 49 (SUPPL. 1): : 1249 - 1253
  • [5] THE 2009 IBM GALE MANDARIN BROADCAST TRANSCRIPTION SYSTEM
    Chu, Stephen M.
    Povey, Daniel
    Kuo, Hong-Kwang
    Mangu, Lidia
    Zhang, Shilei
    Shi, Qin
    Qin, Yong
    2010 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, 2010, : 4374 - 4377
  • [6] Recent advances in the IBM GALE Mandarin transcription system
    Chu, Stephen M.
    Kuo, Hong-Kwang
    Mangu, Lidia
    Liu, Ji
    Qin, Yong
    Shi, Qin
    Zhang, Shi Lei
    Aronowitz, Hagai
    2008 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING, VOLS 1-12, 2008, : 4329 - 4332
  • [7] Advances in mandarin broadcast speech transcription at IBM under the DARPA GALE program
    Qin, Yong
    Shi, Qin
    Liu, Yi Y.
    Aronowitz, Hagai
    Chu, Stephen M.
    Kuo, Hong-Kwang
    Zweig, Geoffrey
    CHINESE SPOKEN LANGUAGE PROCESSING, PROCEEDINGS, 2006, 4274 : 410 - +
  • [8] The IBM BOLT Speech Transcription System
    Thomas, Samuel
    Saon, George
    Kuo, Hong-Kwang
    Mangu, Lidia
    16TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2015), VOLS 1-5, 2015, : 3150 - 3153
  • [9] The IBM mandarin broadcast speech transcription system
    Chu, Stephen M.
    Kuo, Hong-kwang
    Liu, Yi Y.
    Qin, Yong
    Shi, Qin
    Zweig, Geoffrey
    2007 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOL II, PTS 1-3, 2007, : 345 - +
  • [10] The IBM 2007 speech transcription system for European parliamentary speeches
    Ramabhadran, Bhuvana
    Siohan, Olivier
    Sethy, Abhinav
    2007 IEEE WORKSHOP ON AUTOMATIC SPEECH RECOGNITION AND UNDERSTANDING, VOLS 1 AND 2, 2007, : 472 - +