Transcription System for Semi-Spontaneous Estonian Speech

Cited by: 2
Author
Alumaee, Tanel [1]
Affiliation
[1] Tallinn Univ Technol, Inst Cybernet, EE-19086 Tallinn, Estonia
Keywords
Estonian; speech recognition; compound words; RECOGNITION;
DOI
10.3233/978-1-61499-133-5-10
Chinese Library Classification
TP18 [Artificial intelligence theory];
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
This paper describes a speech-to-text system for semi-spontaneous Estonian speech. The system is trained on about 100 hours of manually transcribed speech and a 300M-word text corpus. Compound words are split before building the language model and reconstructed from recognizer output using a hidden event N-gram model. We use a three-pass transcription strategy with unsupervised speaker adaptation between individual passes. The system achieves a word error rate of 34.6% on conference speeches and 25.6% on radio talk shows.
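The abstract reports results as word error rate (WER). As a reminder of what that metric measures, the following is a minimal sketch of the standard definition: the word-level edit distance (substitutions, deletions, insertions) between a reference transcript and the recognizer hypothesis, divided by the number of reference words. The function name `word_error_rate` and the toy sentences are illustrative assumptions; this is not the scoring tool used in the paper, which the record does not name.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length.

    Illustrative sketch only: whitespace tokenization, no text normalization.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # deleting all i reference words matches an empty hypothesis
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,                      # deletion
                curr[j - 1] + 1,                  # insertion
                prev[j - 1] + substitution_cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical toy example: one substitution and one deletion
# against a four-word reference.
print(word_error_rate("a b c d", "a x c"))
```

Because the denominator is the reference length, a hypothesis with many insertions can score above 100%. A compound word that is recognized as two separate words counts as one substitution plus one insertion under this definition, which is one reason compound reconstruction affects the reported figure.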
Pages: 10-17
Page count: 8
Related papers
50 in total
  • [21] Multimodal Prominence Marking in Semi-Spontaneous YouTube Monologs: The Interaction of Intonation and Eyebrow Movements
    Berger, Stephanie
    Zellers, Margaret
    FRONTIERS IN COMMUNICATION, 2022, 7
  • [22] Automatic transcription of spontaneous lecture speech
    Kawahara, T
    Nanjo, H
    Furui, S
    ASRU 2001: IEEE WORKSHOP ON AUTOMATIC SPEECH RECOGNITION AND UNDERSTANDING, CONFERENCE PROCEEDINGS, 2001, : 186 - 189
  • [23] L1 Prosodic transfer and priming effects: A quantitative study on semi-spontaneous dialogues
    Turco, Giuseppina
    Gubian, Michele
    PROCEEDINGS OF THE 6TH INTERNATIONAL CONFERENCE ON SPEECH PROSODY, VOLS I AND II, 2012, : 386 - 389
  • [24] Semi-Spontaneous Post-Crosslinking Triblock Copolymer Electrolyte for Solid-State Lithium Battery
    Zheng, Zhenan
    Huang, Jie
    Gao, Xiang
    Luo, Yingwu
    BATTERIES-BASEL, 2023, 9 (09):
  • [25] Estonian Large Vocabulary Speech Recognition System for Radiology
    Alumaee, Tanel
    Meister, Einar
    HUMAN LANGUAGE TECHNOLOGIES - THE BALTIC PERSPECTIVE, 2010, 219 : 33 - 38
  • [26] Manual vs assisted transcription of prepared and spontaneous speech
    Bazillon, Thierry
    Esteve, Yannick
    Luzzati, Daniel
    SIXTH INTERNATIONAL CONFERENCE ON LANGUAGE RESOURCES AND EVALUATION, LREC 2008, 2008, : 1067 - 1071
  • [27] Prosodic Patterns of Estonian words: a Corpus-Based Description Using Spontaneous Speech
    Nemoto, Rena
    Adda-Decker, Martine
    HUMAN LANGUAGE TECHNOLOGIES: THE BALTIC PERSPECTIVE, 2012, 247 : 286 - +
  • [28] Full-duplex Speech-to-text System for Estonian
    Alumaee, Tanel
    HUMAN LANGUAGE TECHNOLOGIES - THE BALTIC PERSPECTIVE, BALTIC HLT 2014, 2014, 268 : 3 - 10
  • [29] A mandarin lecture speech transcription system for speech summarization
    Chan, Ho Yin
    Zhang, Justin Jian
    Fung, Pascale
    Cao, Lu
    2007 IEEE WORKSHOP ON AUTOMATIC SPEECH RECOGNITION AND UNDERSTANDING, VOLS 1 AND 2, 2007, : 467 - 471
  • [30] Towards automatic transcription of spontaneous Czech speech in the MALACH project
    Psutka, J
    Ircing, P
    Psutka, JV
    Radová, V
    Byrne, W
    Venkataramani, V
    Hajic, J
    Gustman, S
    TEXT, SPEECH AND DIALOGUE, PROCEEDINGS, 2003, 2807 : 214 - 219