Advanced Rich Transcription System for Estonian Speech

Cited by: 23

Authors:

Alumae, Tanel [1]

Tilk, Ottokar [1]

Asadullah [1]

Affiliations:

[1] Tallinn Univ Technol, Lab Language Technol, Tallinn, Estonia

Source:

HUMAN LANGUAGE TECHNOLOGIES - THE BALTIC PERSPECTIVE, BALTIC HLT 2018 | 2018 / Vol. 307

Keywords:

Speech recognition; Estonian; punctuation recovery; speaker identification

DOI:

10.3233/978-1-61499-912-6-1

Chinese Library Classification (CLC):

TP18 [Artificial intelligence theory]

Discipline classification codes:

081104; 0812; 0835; 1405

Abstract:

This paper describes the current TTU speech transcription system for Estonian speech. The system is designed to handle semi-spontaneous speech, such as broadcast conversations, lecture recordings and interviews recorded in diverse acoustic conditions. The system is based on the Kaldi toolkit. Multi-condition training using background noise profiles extracted automatically from untranscribed data is used to improve the robustness of the system. Out-of-vocabulary words are recovered using a phoneme n-gram based decoding subgraph and an FST-based phoneme-to-grapheme model. The system achieves a word error rate of 8.1% on a test set of broadcast conversations. The system also performs punctuation recovery and speaker identification. Speaker identification models are trained using a recently proposed weakly supervised training method.


Pages: 1 / 8

Number of pages: 8
