The 2005 AMI system for the transcription of speech in meetings

Cited by: 0

Authors:

Hain, T [1]

Burget, L

Dines, J

Garau, G

Karafiat, M

Lincoln, M

McCowan, I

Moore, D

Wan, V

Ordelman, R

Renals, S

Affiliations:

[1] Univ Sheffield, Dept Comp Sci, Sheffield S1 4DP, S Yorkshire, England

[2] Brno Univ Technol, Fac Informat Engn, Brno 61266, Czech Republic

[3] IDIAP Res Inst, CH-1920 Martigny, Switzerland

[4] Univ Edinburgh, Ctr Speech Technol Res, Edinburgh EH8 9LW, Midlothian, Scotland

[5] Univ Twente, Dept Elect Engn, NL-7500 AE Enschede, Netherlands

Source:

MACHINE LEARNING FOR MULTIMODAL INTERACTION | 2005 / Vol. 3869

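Keywords:

DOI:

Not available

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Subject classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract: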

In this paper we describe the 2005 AMI system for the transcription of speech in meetings used in the 2005 NIST RT evaluations. The system was designed for participation in the speech-to-text part of the evaluations, in particular for transcription of speech recorded with multiple distant microphones and independent headset microphones. System performance was tested on both conference room and lecture-style meetings. Although input sources are processed using different front-ends, the recognition process is based on a unified system architecture. The system operates in multiple passes and makes use of state-of-the-art technologies such as discriminative training, vocal tract length normalisation, heteroscedastic linear discriminant analysis, speaker adaptation with maximum likelihood linear regression, and minimum word error rate decoding. We report the system performance on the official development and test sets for the NIST RT05s evaluations. The system was jointly developed in less than 10 months by a multi-site team and was shown to achieve competitive performance.

Pages: 450-462

Page count: 13

