Development of Language Models for Continuous Uzbek Speech Recognition System

Cited by: 6
Authors
Mukhamadiyev, Abdinabi [1]
Mukhiddinov, Mukhriddin [1]
Khujayarov, Ilyos [2]
Ochilov, Mannon [3]
Cho, Jinsoo [1]
Affiliations
[1] Gachon Univ, Dept Comp Engn, Seongnam Si 13120, South Korea
[2] Tashkent Univ Informat Technol, Dept Informat Technol, Samarkand Branch, Tashkent 140100, Uzbekistan
[3] Tashkent Univ Informat Technol, Dept Artificial Intelligence, Tashkent 100200, Uzbekistan
Funding
National Research Foundation of Singapore;
Keywords
language model; Uzbek speech; recurrent neural networks; automatic speech recognition; neural networks; character-based language models; word-based language models;
DOI
10.3390/s23031145
Chinese Library Classification number
O65 [Analytical Chemistry];
Discipline classification code
070302; 081704;
Abstract
Automatic speech recognition systems with a large vocabulary and other natural language processing applications cannot operate without a language model. Most studies on pre-trained language models have focused on more popular languages such as English, Chinese, and various European languages, but there is no publicly available Uzbek speech dataset. Therefore, language models of low-resource languages need to be studied and created. The objective of this study is to address this limitation by developing a low-resource language model for the Uzbek language and understanding linguistic occurrences. We propose an Uzbek language model named UzLM by examining the performance of statistical and neural-network-based language models that account for the unique features of the Uzbek language. Our Uzbek-specific linguistic representation allows us to construct a more robust UzLM, utilizing 80 million words from various sources while using the same number of training words as previous studies, or fewer. Roughly sixty-eight thousand different words and 15 million sentences were collected for the creation of this corpus. The experimental results of our tests on the continuous recognition of Uzbek speech show that, compared with manual encoding, the use of neural-network-based language models reduced the character error rate to 5.26%.
Pages: 22
Related papers
50 records in total
  • [21] DNN based continuous speech recognition system of Punjabi language on Kaldi toolkit
    Jyoti Guglani
    A. N. Mishra
    International Journal of Speech Technology, 2021, 24 : 41 - 45
  • [22] A study of neural network Russian language models for automatic continuous speech recognition systems
    Kipyatkova, I. S.
    Karpov, A. A.
    AUTOMATION AND REMOTE CONTROL, 2017, 78 (05) : 858 - 867
  • [23] A study of neural network Russian language models for automatic continuous speech recognition systems
    I. S. Kipyatkova
    A. A. Karpov
    Automation and Remote Control, 2017, 78 : 858 - 867
  • [24] Development of Continuous Automatic Speech Recognition System for Controlling of MAVs through Natural Speech
    Lakshmi, Sandhya R.
    Veena, S.
    Reddy, Roja B.
    2017 INTERNATIONAL CONFERENCE ON RECENT ADVANCES IN ELECTRONICS AND COMMUNICATION TECHNOLOGY (ICRAECT), 2017, : 144 - 148
  • [25] PHONEME-BASED CONTINUOUS SPEECH RECOGNITION RESULTS FOR DIFFERENT LANGUAGE MODELS IN THE 1000-WORD SPICOS SYSTEM
    NEY, H
    PAESELER, A
    SPEECH COMMUNICATION, 1988, 7 (04) : 367 - 374
  • [26] Gaussian mixture clustering and language adaptation for the development of a new language speech recognition system
    Chatzichrisafis, Nikos
    Diakoloukas, Vassilios
    Digalakis, Vassilios
    Harizakis, Costas
    IEEE TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2007, 15 (03): : 928 - 938
  • [27] MODELS OF CONTINUOUS SPEECH RECOGNITION AND THE CONTENTS OF THE VOCABULARY
    MCQUEEN, JM
    CUTLER, A
    BRISCOE, T
    NORRIS, D
    LANGUAGE AND COGNITIVE PROCESSES, 1995, 10 (3-4): : 309 - 331
  • [28] Continuous Speech Recognition System for Chhattisgarhi
    Londhe, N. D.
    Kshirsagar, G. B.
    2017 INTERNATIONAL CONFERENCE ON COMMUNICATION AND SIGNAL PROCESSING (ICCSP), 2017, : 365 - 369
  • [29] Gaussian mixture language models for speech recognition
    Afify, Mohamed
    Siohan, Olivier
    Sarikaya, Ruhi
    2007 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOL IV, PTS 1-3, 2007, : 29 - +
  • [30] Improving language models for radiology speech recognition
    Paulett, John M.
    Langlotz, Curtis P.
    JOURNAL OF BIOMEDICAL INFORMATICS, 2009, 42 (01) : 53 - 58