Transcription System for Semi-Spontaneous Estonian Speech

Cited by: 2
Authors
Alumäe, Tanel [1]
Affiliations
[1] Tallinn Univ Technol, Inst Cybernet, EE-19086 Tallinn, Estonia
Keywords
Estonian; speech recognition; compound words; RECOGNITION;
DOI
10.3233/978-1-61499-133-5-10
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Subject classification codes
081104; 0812; 0835; 1405
Abstract
This paper describes a speech-to-text system for semi-spontaneous Estonian speech. The system is trained on about 100 hours of manually transcribed speech and a 300M-word text corpus. Compound words are split before building the language model and reconstructed from recognizer output using a hidden-event N-gram model. We use a three-pass transcription strategy with unsupervised speaker adaptation between individual passes. The system achieves a word error rate of 34.6% on conference speeches and 25.6% on radio talk shows.
Pages: 10-17
Page count: 8