The IBM BOLT Speech Transcription System

Cited by: 0

Authors:

Thomas, Samuel [1]

Saon, George [1]

Kuo, Hong-Kwang [1]

Mangu, Lidia [1]

Affiliation:

[1] IBM T. J. Watson Research Center, Yorktown Heights, NY 10598, USA

Source:

16TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2015), VOLS 1-5 | 2015

Keywords:

Automatic speech recognition; conversational telephone speech; deep neural networks; machine translation

DOI:

Not available

Chinese Library Classification (CLC) number:

O42 [Acoustics]

Discipline classification codes:

070206; 082403

Abstract:

We describe the IBM automatic speech recognition (ASR) system for the DARPA Broad Operational Language Translation (BOLT) program. The system is used to transcribe conversational telephone speech (CTS) prior to machine translation for Phase 3 of the program's Activity A. The ASR system combines novel sequence-trained ensemble deep neural network acoustic models on speaker-adapted features and convolutional neural network models on two kinds of spectro-temporal representations of speech, in conjunction with a variety of class-based, neural-network-based, and n-gram-based language models. Acoustic and language models for the recognition system are built on transcribed audio released under the program and are further optimized for the final machine translation task. The evaluation system has a word error rate of 32.7% on a 2-hour Egyptian Arabic development set for this task.
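For readers unfamiliar with the metric quoted in the abstract, word error rate (WER) is conventionally defined as the number of word substitutions, deletions, and insertions needed to turn the hypothesis into the reference, divided by the number of reference words. The sketch below illustrates that standard definition only. The function name `word_error_rate` and the whitespace tokenization are assumptions made for illustration; the record does not say which scoring tool or text normalization the authors used.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length.

    Illustrative only: tokenization is plain whitespace splitting, which is
    an assumption and not the normalization used in the paper.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix processed
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,                      # deletion of a reference word
                curr[j - 1] + 1,                  # insertion of a hypothesis word
                prev[j - 1] + substitution_cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)
```

Under this definition, a WER of 32.7% means roughly one word-level edit for every three reference words. Because insertions are counted, the value can exceed 100% for very poor hypotheses.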

Pages: 3150-3153

Page count: 4

50 items in total

[1] The IBM Mandarin broadcast speech transcription system
Chu, Stephen M.
Kuo, Hong-kwang
Liu, Yi Y.
Qin, Yong
Shi, Qin
Zweig, Geoffrey
2007 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOL II, PTS 1-3, 2007, : 345 - +
[2] THE IBM 2009 GALE ARABIC SPEECH TRANSCRIPTION SYSTEM
Kingsbury, Brian
Soltau, Hagen
Saon, George
Chu, Stephen
Kuo, Hong-Kwang
Mangu, Lidia
Ravuri, Suman
Morgan, Nelson
Janin, Adam
2011 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, 2011, : 4672 - 4675
[3] THE IBM 2008 GALE ARABIC SPEECH TRANSCRIPTION SYSTEM
Saon, George
Soltau, Hagen
Chaudhari, Upendra
Chu, Stephen
Kingsbury, Brian
Kuo, Hong-Kwang
Mangu, Lidia
Povey, Daniel
2010 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, 2010, : 4378 - 4381
[4] The IBM 2007 speech transcription system for European parliamentary speeches
Ramabhadran, Bhuvana
Siohan, Olivier
Sethy, Abhinav
2007 IEEE WORKSHOP ON AUTOMATIC SPEECH RECOGNITION AND UNDERSTANDING, VOLS 1 AND 2, 2007, : 472 - +
[5] The IBM 2006 Speech Transcription System for European Parliamentary Speeches
Ramabhadran, B.
Siohan, O.
Mangu, L.
Zweig, G.
Westphal, M.
Schulz, H.
Soneiro, A.
INTERSPEECH 2006 AND 9TH INTERNATIONAL CONFERENCE ON SPOKEN LANGUAGE PROCESSING, VOLS 1-5, 2006, : 1225 - +
[6] IBM REPORTS PROGRESS IN SPEECH RECOGNITION AND TRANSCRIPTION
MICHALOPOULOS, DA
COMPUTER, 1980, 13 (09) : 89 - 90
[7] Recent improvements to IBM's speech recognition system for automatic transcription of broadcast news
Chen, SS
Eide, EM
Gales, MJF
Gopinath, RA
Kanevsky, D
Olsen, P
ICASSP '99: 1999 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, PROCEEDINGS VOLS I-VI, 1999, : 37 - 40
[8] Recent improvements to IBM's speech recognition system for automatic transcription of broadcast news
Chen, S.S.
Eide, E.M.
Gales, M.J.F.
Gopinath, R.A.
Kanevsky, D.
Olsen, P.
ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings, 1999, 1 : 37 - 40
[9] The IBM rich transcription spring 2006 Speech-to-Text system for lecture meetings
Huang, Jing
Westphal, Martin
Chen, Stanley
Siohan, Olivier
Povey, Daniel
Libal, Vit
Soneiro, Alvaro
Schulz, Henrik
Ross, Thomas
Potamianos, Gerasimos
MACHINE LEARNING FOR MULTIMODAL INTERACTION, 2006, 4299 : 432 - +
[10] IBM GALE Mandarin transcription system
Zhang, Shilei
Shi, Qin
Qin, Yong
Liu, Wen
Chu, Stephen-M
Kuo, Hong-Kwang
Mangu, Lidia
Qinghua Daxue Xuebao/Journal of Tsinghua University, 2009, 49 (SUPPL. 1) : 1249 - 1253