The IBM BOLT Speech Transcription System

Cited by: 0

Authors:

Thomas, Samuel [1]

Saon, George [1]

Kuo, Hong-Kwang [1]

Mangu, Lidia [1]

Affiliations:

[1] IBM TJ Watson Res Ctr, Yorktown Hts, NY 10598 USA

Source:

16TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2015), VOLS 1-5, 2015

Keywords:

Automatic speech recognition; conversational telephone speech; deep neural networks; machine translation

DOI:

Not available

Chinese Library Classification:

O42 [Acoustics]

Subject classification codes:

070206; 082403

Abstract:

We describe the IBM automatic speech recognition (ASR) system for the DARPA Broad Operational Language Translation (BOLT) program. The system is used to transcribe conversational telephone speech (CTS) prior to machine translation for Phase 3 of the program's Activity A. The ASR system combines novel sequence-trained ensemble deep neural network acoustic models on speaker-adapted features with convolutional neural network models on two kinds of spectro-temporal representations of speech, in conjunction with a variety of class-based, neural network, and n-gram language models. Acoustic and language models for the recognition system are built on transcribed audio released under the program and are further optimized for the final machine translation task. The evaluation system has a word error rate of 32.7% on a 2-hour Egyptian Arabic development set for this task.
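
The word error rate quoted in the abstract is conventionally the word-level edit distance between a reference transcript and the recognizer output, divided by the number of reference words. The following is a minimal sketch of that conventional definition, not the scoring tool used for the BOLT evaluation; the function name `word_error_rate` and the whitespace tokenization are assumptions made for illustration.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length.

    Assumption: words are separated by whitespace and compared exactly;
    real scoring pipelines usually normalize the text first.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1        # reference word missing from hypothesis
            insertion = curr[j - 1] + 1   # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)
```

Because insertions count as errors while the denominator is the reference length, the value can exceed 1.0 when the hypothesis is much longer than the reference.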

Pages: 3150-3153

Page count: 4
