Cross-language Bootstrapping for Unsupervised Acoustic Model Training: Rapid Development of a Polish Speech Recognition System

Authors
Loeoef, Jonas [1]
Gollan, Christian [1]
Ney, Hermann [1]
Affiliations
[1] Rhein Westfal TH Aachen, Lehrstuhl Informat 6, Dept Comp Sci, Aachen, Germany
Keywords
speech recognition; unsupervised training; cross-language bootstrapping
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
This paper describes the rapid development of a Polish speech recognition system without access to any transcribed acoustic training data. This was achieved by combining cross-language bootstrapping with confidence-based unsupervised acoustic model training. A Spanish acoustic model was ported to Polish using a manually constructed phoneme mapping. This initial model was then refined through iterative recognition and retraining on the untranscribed audio data. The system was trained and evaluated on recordings from the European Parliament and, in addition to unsupervised model training, included several state-of-the-art speech recognition techniques. Confidence-based speaker adaptive training using feature space transform adaptation, as well as vocal tract length normalization and maximum likelihood linear regression, was used to refine the acoustic model. Through the combination of these techniques, good performance was achieved on the domain of parliamentary speeches.
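The two ideas in the abstract, porting a lexicon through a phoneme mapping and iteratively retraining on confidently recognized audio, can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: the mapping entries, the function names, the `recognize` and `retrain` callables, and the fixed confidence threshold are all hypothetical, and the abstract does not give the actual mapping or selection rule.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical Polish-to-Spanish phoneme mapping. The entries are
# illustrative placeholders; the paper's manually constructed mapping
# is not given in the abstract.
POLISH_TO_SPANISH: Dict[str, str] = {
    "a": "a",
    "e": "e",
    "sz": "s",  # assumed nearest source-language phoneme
    "w": "b",   # assumed nearest source-language phoneme
}

Hypothesis = Tuple[str, float]  # (recognized word, confidence in [0, 1])


def port_lexicon(
    target_lexicon: Dict[str, List[str]], mapping: Dict[str, str]
) -> Dict[str, List[str]]:
    """Rewrite each target-language pronunciation with source-language phonemes."""
    return {
        word: [mapping[phone] for phone in phones]
        for word, phones in target_lexicon.items()
    }


def select_confident(hyps: List[Hypothesis], threshold: float) -> List[str]:
    """Keep only the words whose confidence reaches the threshold."""
    return [word for word, conf in hyps if conf >= threshold]


def unsupervised_training(
    model,
    audio: List[str],
    recognize: Callable,  # hypothetical: (model, utterance) -> List[Hypothesis]
    retrain: Callable,    # hypothetical: (model, selected data) -> new model
    iterations: int,
    threshold: float,
):
    """Alternate recognition and retraining on untranscribed audio."""
    for _ in range(iterations):
        selected = []
        for utterance in audio:
            words = select_confident(recognize(model, utterance), threshold)
            if words:  # skip utterances with no confident words
                selected.append((utterance, words))
        model = retrain(model, selected)
    return model
```

In this sketch the bootstrapped model would be passed in as `model`, and each pass uses the previous pass's model to produce the automatic transcripts for the next round of training.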
Pages: 96-99
Number of pages: 4