The IBM 2006 Speech Transcription System for European Parliamentary Speeches

Cited by: 0

Authors:

Ramabhadran, B. [1]

Siohan, O. [1]

Mangu, L. [1]

Zweig, G. [1]

Westphal, M. [2]

Schulz, H. [2]

Soneiro, A. [2]

Affiliations:

[1] IBM TJ Watson Res Ctr, Yorktown Hts, NY 10598 USA

[2] IBM Germany, EMEA Voice Technol Dev, Munich, Germany

Source:

INTERSPEECH 2006 AND 9TH INTERNATIONAL CONFERENCE ON SPOKEN LANGUAGE PROCESSING, VOLS 1-5 | 2006

Keywords:

speech recognition; automatic segmentation; cross-adaptation; randomized decision trees; TC-STAR

DOI:

Not available

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

TC-STAR is a European Union-funded speech-to-speech translation project to transcribe, translate, and synthesize European Parliamentary Plenary Speeches (EPPS). This paper describes IBM's English and Spanish speech recognition systems submitted to the TC-STAR 2006 Evaluation. The technical advances in this submission include two different algorithms for automatic segmentation and speaker clustering of the input audio; a system architecture based on cross-adaptation across these two segmentation schemes and on system combination, in which an ensemble of systems is generated using randomized decision tree state-tying; automatic punctuation of the speech recognition output; and the incorporation of an additional 35 hours of in-domain EPPS acoustic training data. On the 2006 English development test set, these advances reduced the error rate by 30% relative to the best-performing system in the TC-STAR 2005 Evaluation, and they produced one of the best-performing systems in the 2006 English evaluation, with a word error rate of 8.3%.
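The abstract reports its results as a word error rate (WER) and as a relative error reduction. As a minimal sketch of how these two quantities are commonly computed, assuming the standard word-level edit distance definition of WER (the record does not describe the evaluation's actual scoring tool), the helper names `word_error_rate` and `relative_reduction` and all sample inputs below are hypothetical and not taken from the paper:

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = (substitutions + deletions + insertions) / reference word count."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the edit distance between the reference prefix
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # deleting all i reference words
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1] / len(ref)


def relative_reduction(baseline: float, improved: float) -> float:
    """Fraction by which `improved` lowers `baseline` (0.30 means 30% relative)."""
    return (baseline - improved) / baseline


# Illustrative inputs only; these are not figures from the paper.
wer = word_error_rate("the house is open", "the house open")
gain = relative_reduction(baseline=10.0, improved=7.0)
```

A relative reduction is measured against the baseline error rate, so a 30% relative gain is not the same as a drop of 30 percentage points.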

Pages: 1225 - +

Number of pages: 2
