The 2005 AMI system for the transcription of speech in meetings

Cited by: 0
Authors
Hain, T [1]
Burget, L
Dines, J
Garau, G
Karafiat, M
Lincoln, M
McCowan, I
Moore, D
Wan, V
Ordelman, R
Renals, S
Affiliations
[1] Univ Sheffield, Dept Comp Sci, Sheffield S1 4DP, S Yorkshire, England
[2] Brno Univ Technol, Fac Informat Engn, Brno 61266, Czech Republic
[3] IDIAP Res Inst, CH-1920 Martigny, Switzerland
[4] Univ Edinburgh, Ctr Speech Technol Res, Edinburgh EH8 9LW, Midlothian, Scotland
[5] Univ Twente, Dept Elect Engn, NL-7500 AE Enschede, Netherlands
Source
MACHINE LEARNING FOR MULTIMODAL INTERACTION | 2005 / Vol. 3869
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
In this paper we describe the 2005 AMI system for the transcription of speech in meetings, as used in the 2005 NIST RT evaluations. The system was designed for participation in the speech-to-text part of the evaluations, in particular for transcription of speech recorded with multiple distant microphones and independent headset microphones. System performance was tested on both conference-room and lecture-style meetings. Although input sources are processed using different front-ends, the recognition process is based on a unified system architecture. The system operates in multiple passes and makes use of state-of-the-art technologies such as discriminative training, vocal tract length normalisation, heteroscedastic linear discriminant analysis, speaker adaptation with maximum likelihood linear regression, and minimum word error rate decoding. We report system performance on the official development and test sets for the NIST RT05s evaluations. The system was jointly developed in less than 10 months by a multi-site team and was shown to achieve competitive performance.
Pages: 450 - 462
Page count: 13
Related papers
50 items in total
  • [31] THE IBM 2008 GALE ARABIC SPEECH TRANSCRIPTION SYSTEM
    Saon, George
    Soltau, Hagen
    Chaudhari, Upendra
    Chu, Stephen
    Kingsbury, Brian
    Kuo, Hong-Kwang
    Mangu, Lidia
    Povey, Daniel
    2010 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, 2010, : 4378 - 4381
  • [32] The 1998 HTK system for transcription of conversational telephone speech
    Hain, T
    Woodland, PC
    Niesler, TR
    Whittaker, EWD
    ICASSP '99: 1999 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, PROCEEDINGS VOLS I-VI, 1999, : 57 - 60
  • [33] The risks of speech: Meetings with Oswald Ducrot
    Kirst De Souza, Alexandre
    Dall Cortivo Lebler, Cristiane
    ENTREPALAVRAS, 2019, 9 (02): : 534 - 540
  • [34] Microphone array speech recognition: Experiments on overlapping speech in meetings
    Moore, DC
    McCowan, IA
    2003 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOL V, PROCEEDINGS: SENSOR ARRAY & MULTICHANNEL SIGNAL PROCESSING AUDIO AND ELECTROACOUSTICS MULTIMEDIA SIGNAL PROCESSING, 2003, : 497 - 500
  • [35] IGAMSAI - Idea Generation with Autonomous multimedia gathering for Meetings Summaries in AmI
    Freitas, Carlos Filipe
    Meireles, Antonio
    Figueiredo, Lino
    Ramos, Carlos
    WORKSHOPS PROCEEDINGS OF THE 5TH INTERNATIONAL CONFERENCE ON INTELLIGENT ENVIRONMENTS, 2009, 4 : 289 - 296
  • [36] Robust speaker segmentation for meetings: The ICSI-SRI Spring 2005 Diarization System
    Anguera, X
    Wooters, C
    Peskin, B
    Aguiló, M
    MACHINE LEARNING FOR MULTIMODAL INTERACTION, 2005, 3869 : 402 - 414
  • [37] CRIM'S FRENCH SPEECH TRANSCRIPTION SYSTEM FOR ETAPE 2011
    Gupta, Vishwa
    Boulianne, Gilles
    Osterrath, Frederic
    Ouellet, Pierre
    2013 8TH INTERNATIONAL WORKSHOP ON SYSTEMS, SIGNAL PROCESSING AND THEIR APPLICATIONS (WOSSPA), 2013, : 351 - 356
  • [38] The 2003 ISL rich transcription system for conversational telephony speech
    Soltau, H
    Yu, H
    Metze, F
    Fügen, C
    Jin, Q
    Jou, SC
    2004 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOL I, PROCEEDINGS: SPEECH PROCESSING, 2004, : 773 - 776
  • [39] The IBM 2007 speech transcription system for European parliamentary speeches
    Ramabhadran, Bhuvana
    Siohan, Olivier
    Sethy, Abhinav
    2007 IEEE WORKSHOP ON AUTOMATIC SPEECH RECOGNITION AND UNDERSTANDING, VOLS 1 AND 2, 2007, : 472 - +
  • [40] System for speech transcription and post-editing in Microsoft Word
    Salimbajevs, Askars
    Ikauniece, Indra
    18TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2017), VOLS 1-6: SITUATED INTERACTION, 2017, : 825 - 826