Advanced Rich Transcription System for Estonian Speech

Cited by: 23

Authors:

Alumae, Tanel [1]

Tilk, Ottokar [1]

Asadullah [1]

Affiliations:

[1] Tallinn Univ Technol, Lab Language Technol, Tallinn, Estonia

Source:

HUMAN LANGUAGE TECHNOLOGIES - THE BALTIC PERSPECTIVE, BALTIC HLT 2018 | 2018 / Vol. 307

Keywords:

Speech recognition; Estonian; punctuation recovery; speaker identification

DOI:

10.3233/978-1-61499-912-6-1

Chinese Library Classification number:

TP18 [Artificial intelligence theory]

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

This paper describes the current TTU speech transcription system for Estonian speech. The system is designed to handle semi-spontaneous speech, such as broadcast conversations, lecture recordings and interviews recorded in diverse acoustic conditions. The system is based on the Kaldi toolkit. Multi-condition training using background noise profiles extracted automatically from untranscribed data is used to improve the robustness of the system. Out-of-vocabulary words are recovered using a phoneme n-gram based decoding subgraph and an FST-based phoneme-to-grapheme model. The system achieves a word error rate of 8.1% on a test set of broadcast conversations. The system also performs punctuation recovery and speaker identification. Speaker identification models are trained using a recently proposed weakly supervised training method.
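
The abstract reports a word error rate (WER) of 8.1%. As a reminder of what that metric measures, here is a minimal sketch of the standard definition: the word-level edit distance (substitutions, deletions, insertions) divided by the number of reference words. The function name `word_error_rate` and the sample sentences are illustrative assumptions, and this is not the scoring tool used by the authors.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length.

    Hypothetical helper for illustration; real evaluations usually also
    normalize case, punctuation and compound words before scoring.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far (excluding the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,                      # deletion
                curr[j - 1] + 1,                  # insertion
                prev[j - 1] + substitution_cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # One substitution and one deletion against four reference words.
    print(word_error_rate("a b c d", "a x c"))
```

Under this definition, a WER of 8.1% corresponds to roughly 8 word-level errors per 100 reference words.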


Pages: 1 / 8

Page count: 8

50 records in total

[41] PHONEMIA A PHONEME TRANSCRIPTION SYSTEM FOR SPEECH SYNTHESIS IN MODERN GREEK
BAKAMIDIS, S
CARAYANNIS, G
SPEECH COMMUNICATION, 1987, 6 (02) : 159 - 169
[42] The IBM 2006 Speech Transcription System for European Parliamentary Speeches
Ramabhadran, B.
Siohan, O.
Mangu, L.
Zweig, G.
Westphal, M.
Schulz, H.
Soneiro, A.
INTERSPEECH 2006 AND 9TH INTERNATIONAL CONFERENCE ON SPOKEN LANGUAGE PROCESSING, VOLS 1-5, 2006, : 1225 - +
[43] Development of Transcription System Using Speech Recognition for Program Production
Mishima T.
Hagiwara A.
Ito H.
Komori T.
Horikawa D.
Kawase N.
Sato S.
1600, Inst. of Image Information and Television Engineers (74) : 729 - 735
[44] Prεεch: A System for Privacy-Preserving Speech Transcription
Ahmed, Shimaa
Chowdhury, Amrita Roy
Fawaz, Kassem
Ramanathan, Parmesh
PROCEEDINGS OF THE 29TH USENIX SECURITY SYMPOSIUM, 2020, : 2703 - 2720
[45] STATISTICAL LANGUAGE MODEL ADAPTATION FOR ESTONIAN SPEECH RECOGNITION
Alumaee, Tanel
EESTI RAKENDUSLINGVISTIKA UHINGU AASTARAAMAT, 2008, 4 : 5 - 16
[46] DEVELOPMENTAL CHANGES IN ACOUSTIC CHARACTERISTICS OF SPEECH OF ESTONIAN ADOLESCENTS
Meister, Einar
Meister, Lya
FOLKLORE-ELECTRONIC JOURNAL OF FOLKLORE, 2023, (90) : 179 - 206
[47] Shallow Parsing of Transcribed Speech of Estonian and Disfluency Detection
Muurisep, Kaili
Nigol, Helen
HUMAN LANGUAGE TECHNOLOGY: CHALLENGES OF THE INFORMATION SOCIETY, 2009, 5603 : 165 - +
[48] Modelling speech temporal structure for Estonian text-to-speech synthesis: Feature selection
Mihkla, Meelis
TRAMES-JOURNAL OF THE HUMANITIES AND SOCIAL SCIENCES, 2007, 11 (03) : 284 - 298
[49] Creation of HMM-based Speech Model for Estonian Text-to-Speech Synthesis
Nurk, Tonis
HUMAN LANGUAGE TECHNOLOGIES: THE BALTIC PERSPECTIVE, 2012, 247 : 162 - 168
[50] Optimizing the perception of soft speech and speech in noise with the Advanced Bionics cochlear implant system
Holden, Laura K.
Reeder, Ruth M.
Firszt, Jill B.
Finley, Charles C.
INTERNATIONAL JOURNAL OF AUDIOLOGY, 2011, 50 (04) : 255 - 269
