Transcription System for Semi-Spontaneous Estonian Speech

Cited by: 2

Authors:

Alumaee, Tanel [1]

Affiliations:

[1] Tallinn Univ Technol, Inst Cybernet, EE-19086 Tallinn, Estonia

Source:

HUMAN LANGUAGE TECHNOLOGIES: THE BALTIC PERSPECTIVE | 2012 / Vol. 247

Keywords:

Estonian; speech recognition; compound words; RECOGNITION;

DOI:

10.3233/978-1-61499-133-5-10

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

This paper describes a speech-to-text system for semi-spontaneous Estonian speech. The system is trained on about 100 hours of manually transcribed speech and a 300M-word text corpus. Compound words are split before building the language model and reconstructed from recognizer output using a hidden event N-gram model. We use a three-pass transcription strategy with unsupervised speaker adaptation between individual passes. The system achieves a word error rate of 34.6% on conference speeches and 25.6% on radio talk shows.

Pages: 10 - 17

Number of pages: 8
