Cross-language Bootstrapping for Unsupervised Acoustic Model Training: Rapid Development of a Polish Speech Recognition System

Cited by: 0
Authors
Loeoef, Jonas [1]
Gollan, Christian [1]
Ney, Hermann [1]
Affiliations
[1] Rhein Westfal TH Aachen, Lehrstuhl Informat 6, Dept Comp Sci, Aachen, Germany
Keywords
speech recognition; unsupervised training; cross-language bootstrapping
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
This paper describes the rapid development of a Polish-language speech recognition system. The system was developed without access to any transcribed acoustic training data. This was achieved through the combined use of cross-language bootstrapping and confidence-based unsupervised acoustic model training. A Spanish acoustic model was ported to Polish using a manually constructed phoneme mapping. This initial model was then refined through iterative recognition and retraining on the untranscribed audio data. The system was trained and evaluated on recordings from the European Parliament, and included several state-of-the-art speech recognition techniques in addition to unsupervised model training. Confidence-based speaker adaptive training using feature space transform adaptation, as well as vocal tract length normalization and maximum likelihood linear regression, was used to refine the acoustic model. Through the combination of these techniques, good performance was achieved on the domain of parliamentary speeches.
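The two steps named in the abstract, porting a source-language model through a phoneme mapping and then iterating recognition and retraining on confidently recognised audio, can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the mapping entries, the word-level confidence scores, the fixed threshold of 0.7, and the `recognise` and `retrain` callables are all hypothetical placeholders that the abstract does not specify.

```python
from typing import Any, Callable, Dict, List, Tuple

# Hypothetical Polish-to-Spanish phoneme mapping. The symbols are
# illustrative only and are NOT the mapping used in the paper.
POLISH_TO_SPANISH: Dict[str, str] = {"a": "a", "sz": "s", "w": "b", "y": "i"}

# Assumed recogniser output: (recognised word, confidence in [0, 1]).
Hypothesis = Tuple[str, float]


def bootstrap_models(mapping: Dict[str, str],
                     source_models: Dict[str, Any]) -> Dict[str, Any]:
    """Initialise each target phoneme with the model of its mapped source phoneme."""
    return {target: source_models[source] for target, source in mapping.items()}


def select_confident(hyps: List[Hypothesis], threshold: float) -> List[str]:
    """Keep only the words whose confidence reaches the threshold."""
    return [word for word, conf in hyps if conf >= threshold]


def unsupervised_training(model: Any,
                          audio: List[Any],
                          recognise: Callable[[Any, Any], List[Hypothesis]],
                          retrain: Callable[[Any, List[Tuple[Any, List[str]]]], Any],
                          threshold: float = 0.7,
                          iterations: int = 3) -> Any:
    """Alternate recognition and retraining on untranscribed audio.

    `recognise` and `retrain` are placeholders for a real decoder and a
    real acoustic model trainer.
    """
    for _ in range(iterations):
        selected = []
        for utterance in audio:
            hyps = recognise(model, utterance)
            # Only confidently recognised words serve as training transcripts.
            selected.append((utterance, select_confident(hyps, threshold)))
        model = retrain(model, selected)
    return model
```

In this sketch the bootstrapped model would be passed as the initial `model`, and each iteration re-recognises the same audio with the improved model, so the automatic transcripts used for training are expected to get better from one pass to the next.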
Pages: 96-99
Number of pages: 4
Related papers
50 in total
  • [31] Grammar based automatic speech recognition system for the Polish language
    Korzinek, Danijel
    Brocki, Lukasz
    RECENT ADVANCES IN MECHATRONICS, 2007, : 87 - +
  • [32] SARMATA 2.0 Automatic Polish Language Speech Recognition System
    Ziolko, Bartosz
    Jadczyk, Tomasz
    Skurzok, Dawid
    Zelasko, Piotr
    Galka, Jakub
    Pedzimaz, Tomasz
    Gawlik, Ireneusz
    Palka, Szymon
    16TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2015), VOLS 1-5, 2015, : 1062 - +
  • [33] Speech variability: A cross-language study on acoustic variations of speaking versus untrained singing
    Hansen, John H. L.
    Bokshi, Marigona
    Khorram, Soheil
    JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA, 2020, 148 (02): : 829 - 844
  • [34] Privacy Preserving Acoustic Model Training for Speech Recognition
    Tachioka, Yuuki
    2020 ASIA-PACIFIC SIGNAL AND INFORMATION PROCESSING ASSOCIATION ANNUAL SUMMIT AND CONFERENCE (APSIPA ASC), 2020, : 627 - 631
  • [35] LANGUAGE MODEL BOOTSTRAPPING USING NEURAL MACHINE TRANSLATION FOR CONVERSATIONAL SPEECH RECOGNITION
    Punjabi, Surabhi
    Arsikere, Harish
    Garimella, Sri
    2019 IEEE AUTOMATIC SPEECH RECOGNITION AND UNDERSTANDING WORKSHOP (ASRU 2019), 2019, : 487 - 493
  • [36] A Cross-Language Study of Acoustic Predictors of Speech Intelligibility in Individuals With Parkinson's Disease
    Kim, Yunjung
    Choi, Yaelin
    JOURNAL OF SPEECH LANGUAGE AND HEARING RESEARCH, 2017, 60 (09): : 2506 - 2518
  • [37] Speech recognition based on unified model of acoustic and language aspects of speech
    1600, Nippon Telegraph and Telephone Corp. (11):
  • [38] Development of Hausa Acoustic Model for Speech Recognition
    Ibrahim, Umar Adam
    Boukar, Moussa Mahamat
    Suleiman, Muhammad Aliyu
    INTERNATIONAL JOURNAL OF ADVANCED COMPUTER SCIENCE AND APPLICATIONS, 2022, 13 (05) : 503 - 508
  • [39] Multi-model fusion framework based on multi-input cross-language emotional speech recognition
    Hu, Guohua
    Zhao, Qingshan
    International Journal of Wireless and Mobile Computing, 2021, 20 (01): : 32 - 40
  • [40] A Comparison of Language Model Training Techniques in a Continuous Speech Recognition System for Serbian
    Popovic, Branislav
    Pakoci, Edvin
    Pekar, Darko
    SPEECH AND COMPUTER (SPECOM 2018), 2018, 11096 : 522 - 531