The IBM 2006 Speech Transcription System for European Parliamentary Speeches

Cited by: 0
Authors
Ramabhadran, B. [1]
Siohan, O. [1]
Mangu, L. [1]
Zweig, G. [1]
Westphal, M. [2]
Schulz, H. [2]
Soneiro, A. [2]
Affiliations
[1] IBM TJ Watson Res Ctr, Yorktown Hts, NY 10598 USA
[2] IBM Germany, EMEA Voice Technol Dev, Munich, Germany
Keywords
speech recognition; automatic segmentation; cross-adaptation; randomized decision trees; TC-STAR;
DOI
Not available
Chinese Library Classification
TP18 [Artificial Intelligence Theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
TC-STAR is a European Union-funded speech-to-speech translation project to transcribe, translate, and synthesize European Parliamentary Plenary Speeches (EPPS). This paper describes IBM's English and Spanish speech recognition systems submitted to the TC-STAR 2006 Evaluation. The technical advances in this submission include two different algorithms for automatic segmentation and speaker clustering of the input audio; a system architecture based on cross-adaptation across these two segmentation schemes and on system combination through an ensemble of systems generated with randomized decision tree state-tying; automatic punctuation of the speech recognition output; and the incorporation of an additional 35 hours of in-domain EPPS acoustic training data. On the 2006 English development test set, these advances reduced the error rate by 30% relative compared with the best-performing system in the TC-STAR 2005 Evaluation, and they produced one of the best-performing English systems in the 2006 evaluation, with a word error rate of 8.3%.
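The abstract reports results as a word error rate (WER) and as a relative error reduction. A minimal sketch of how these two quantities are conventionally computed follows; it is not the evaluation's scoring tool, the function names `word_error_rate` and `relative_reduction` are hypothetical, and the sentences and numbers in the usage lines are illustrative only, not taken from the paper.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = (substitutions + deletions + insertions) / number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the edit distance between the reference words
    # consumed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion of a reference word
                curr[j - 1] + 1,     # insertion of a hypothesis word
                prev[j - 1] + cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


def relative_reduction(baseline_wer: float, new_wer: float) -> float:
    """Fraction of the baseline error rate that the new system removes."""
    return (baseline_wer - new_wer) / baseline_wer


# Illustrative inputs only (not data from the paper).
wer = word_error_rate("the house is open", "the house was open")
gain = relative_reduction(10.0, 7.0)
```

With these illustrative inputs, one substituted word out of four gives a WER of 0.25, and moving from a 10.0% to a 7.0% error rate is a 30% relative reduction.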
Pages: 1225 / +
Page count: 2
Related papers
50 in total
  • [1] The IBM 2007 speech transcription system for European parliamentary speeches
    Ramabhadran, Bhuvana
    Siohan, Olivier
    Sethy, Abhinav
    2007 IEEE WORKSHOP ON AUTOMATIC SPEECH RECOGNITION AND UNDERSTANDING, VOLS 1 AND 2, 2007, : 472 - +
  • [2] The 2006 RWTH Parliamentary Speeches Transcription System
    Loeoef, J.
    Bisani, M.
    Gollan, Ch.
    Heigold, G.
    Hoffmeister, B.
    Plahl, Ch.
    Schlueter, R.
    Ney, H.
    INTERSPEECH 2006 AND 9TH INTERNATIONAL CONFERENCE ON SPOKEN LANGUAGE PROCESSING, VOLS 1-5, 2006, : 105 - 108
  • [3] The ISL 2007 English Speech Transcription System for European Parliament Speeches
    Stueker, Sebastian
    Fuegen, Christian
    Kraft, Florian
    Woelfel, Matthias
    INTERSPEECH 2007: 8TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION, VOLS 1-4, 2007, : 673 - 676
  • [4] The IBM BOLT Speech Transcription System
    Thomas, Samuel
    Saon, George
    Kuo, Hong-Kwang
    Mangu, Lidia
    16TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2015), VOLS 1-5, 2015, : 3150 - 3153
  • [5] The IBM rich transcription spring 2006 Speech-to-Text system for lecture meetings
    Huang, Jing
    Westphal, Martin
    Chen, Stanley
    Siohan, Olivier
    Povey, Daniel
    Libal, Vit
    Soneiro, Alvaro
    Schulz, Henrik
    Ross, Thomas
    Potamianos, Gerasimos
    MACHINE LEARNING FOR MULTIMODAL INTERACTION, 2006, 4299 : 432 - +
  • [6] The IBM mandarin broadcast speech transcription system
    Chu, Stephen M.
    Kuo, Hong-kwang
    Liu, Yi Y.
    Qin, Yong
    Shi, Qin
    Zweig, Geoffrey
    2007 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOL II, PTS 1-3, 2007, : 345 - +
  • [7] THE IBM 2009 GALE ARABIC SPEECH TRANSCRIPTION SYSTEM
    Kingsbury, Brian
    Soltau, Hagen
    Saon, George
    Chu, Stephen
    Kuo, Hong-Kwang
    Mangu, Lidia
    Ravuri, Suman
    Morgan, Nelson
    Janin, Adam
    2011 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, 2011, : 4672 - 4675
  • [8] THE IBM 2008 GALE ARABIC SPEECH TRANSCRIPTION SYSTEM
    Saon, George
    Soltau, Hagen
    Chaudhari, Upendra
    Chu, Stephen
    Kingsbury, Brian
    Kuo, Hong-Kwang
    Mangu, Lidia
    Povey, Daniel
    2010 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, 2010, : 4378 - 4381
  • [9] IBM REPORTS PROGRESS IN SPEECH RECOGNITION AND TRANSCRIPTION
    MICHALOPOULOS, DA
    COMPUTER, 1980, 13 (09) : 89 - 90
  • [10] Super-Human Multi-Talker Speech Recognition: The IBM 2006 Speech Separation Challenge System
    Kristjansson, T.
    Hershey, J.
    Olsen, P.
    Rennie, S.
    Gopinath, R.
    INTERSPEECH 2006 AND 9TH INTERNATIONAL CONFERENCE ON SPOKEN LANGUAGE PROCESSING, VOLS 1-5, 2006, : 97 - 100