The 2005 AMI system for the transcription of speech in meetings

Cited by: 0
Authors
Hain, T [1]
Burget, L
Dines, J
Garau, G
Karafiat, M
Lincoln, M
McCowan, I
Moore, D
Wan, V
Ordelman, R
Renals, S
Affiliations
[1] Univ Sheffield, Dept Comp Sci, Sheffield S1 4DP, S Yorkshire, England
[2] Brno Univ Technol, Fac Informat Engn, Brno 61266, Czech Republic
[3] IDIAP Res Inst, CH-1920 Martigny, Switzerland
[4] Univ Edinburgh, Ctr Speech Technol Res, Edinburgh EH8 9LW, Midlothian, Scotland
[5] Univ Twente, Dept Elect Engn, NL-7500 AE Enschede, Netherlands
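Keywords
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes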
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
In this paper we describe the 2005 AMI system for the transcription of speech in meetings used in the 2005 NIST RT evaluations. The system was designed for participation in the speech-to-text part of the evaluations, in particular for transcription of speech recorded with multiple distant microphones and independent headset microphones. System performance was tested on both conference room and lecture-style meetings. Although input sources are processed using different front-ends, the recognition process is based on a unified system architecture. The system operates in multiple passes and makes use of state-of-the-art technologies such as discriminative training, vocal tract length normalisation, heteroscedastic linear discriminant analysis, speaker adaptation with maximum likelihood linear regression and minimum word error rate decoding. We also report the system performance on the official development and test sets for the NIST RT05s evaluations. The system was jointly developed in less than 10 months by a multi-site team and was shown to achieve competitive performance.
Pages: 450-462
Number of pages: 13
Related papers
50 in total
  • [1] The AMI system for the transcription of speech in meetings
    Hain, Thomas
    Burget, Lukas
    Dines, John
    Garau, Giulia
    Karafiat, Martin
    Lincoln, Mike
    Vepa, Jithendra
    Wan, Vincent
    2007 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOL IV, PTS 1-3, 2007, : 357 - +
  • [2] The development of the AMI system for the transcription of speech in meetings
    Hain, T
    Burget, L
    Dines, J
    McCowan, I
    Garau, G
    Karafiat, M
    Lincoln, M
    Moore, D
    Wan, V
    Ordelman, R
    Renals, S
    MACHINE LEARNING FOR MULTIMODAL INTERACTION, 2005, 3869 : 344 - 356
  • [3] Exploring speech retrieval from meetings using the AMI corpus
    Eskevich, Maria
    Jones, Gareth J. F.
    COMPUTER SPEECH AND LANGUAGE, 2014, 28 (05) : 1021 - 1044
  • [4] The IBM rich transcription spring 2006 Speech-to-Text system for lecture meetings
    Huang, Jing
    Westphal, Martin
    Chen, Stanley
    Siohan, Olivier
    Povey, Daniel
    Libal, Vit
    Soneiro, Alvaro
    Schulz, Henrik
    Ross, Thomas
    Potamianos, Gerasimos
    MACHINE LEARNING FOR MULTIMODAL INTERACTION, 2006, 4299 : 432 - +
  • [5] The 2007 AMI(DA) system for meeting transcription
    Hain, Thomas
    Burget, Lukas
    Dines, John
    Garau, Giulia
    Karafiat, Martin
    van Leeuwen, David
    Lincoln, Mike
    Wan, Vincent
    MULTIMODAL TECHNOLOGIES FOR PERCEPTION OF HUMANS, 2008, 4625 : 414 - +
  • [6] The AMI meeting transcription system: Progress and performance
    Hain, Thomas
    Burget, Lukas
    Dines, John
    Garau, Giulia
    Karafiat, Martin
    Lincoln, Mike
    Vepa, Jithendra
    Wan, Vincent
    MACHINE LEARNING FOR MULTIMODAL INTERACTION, 2006, 4299 : 419 - +
  • [7] Patterns of speech activity at ETUC Executive Committee meetings, 2005-2012
    Furaker, Bengt
    Selden, Kristina Loven
    EUROPEAN JOURNAL OF INDUSTRIAL RELATIONS, 2016, 22 (01) : 57 - 71
  • [8] The IBM rich transcription 2007 speech-to-text systems for lecture meetings
    Huang, Jing
    Marcheret, Etienne
    Visweswariah, Karthik
    Libal, Vit
    Potamianos, Gerasimos
    MULTIMODAL TECHNOLOGIES FOR PERCEPTION OF HUMANS, 2008, 4625 : 429 - 441
  • [9] Automatic Transcription System for Meetings of the Japanese National Congress
    Akita, Yuya
    Mimura, Masato
    Kawahara, Tatsuya
    INTERSPEECH 2009: 10TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2009, VOLS 1-5, 2009, : 80 - 83
  • [10] A Mandarin lecture speech transcription system for speech summarization
    Chan, Ho Yin
    Zhang, Justin Jian
    Fung, Pascale
    Cao, Lu
    2007 IEEE WORKSHOP ON AUTOMATIC SPEECH RECOGNITION AND UNDERSTANDING, VOLS 1 AND 2, 2007, : 467 - 471