Syllable Based Language Model for Large Vocabulary Continuous Speech Recognition of Polish

Authors
Majewski, Piotr [1]
Affiliation
[1] Univ Lodz, Fac Math & Comp Sci, PL-90238 Lodz, Poland
Keywords
Polish; large vocabulary continuous speech recognition; language modeling; sub-word units; syllable-based units
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Most state-of-the-art large vocabulary continuous speech recognition systems use word-based n-gram language models. Such models are not an optimal solution for inflectional or agglutinative languages. Polish is a highly inflectional language and requires very large corpora to build an adequate language model with a small out-of-vocabulary ratio. We propose a syllable-based language model, which is better suited to a highly inflectional language like Polish. When resources are scarce (i.e., corpora are small), the syllable-based model outperforms word-based models in terms of the number of out-of-vocabulary units (syllables in our model). Such a model is an approximation of the morpheme-based model for Polish. In our paper, we show the results of an evaluation of the syllable-based model and its usefulness in speech recognition tasks.
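The out-of-vocabulary comparison described in the abstract can be illustrated with a minimal sketch. It is not the paper's method or data: the function name `oov_rate`, the toy word lists, and the hand-written syllable segmentation in `SYLLABLES` are all assumptions made for illustration, and a real system would need a proper Polish syllabifier and a real corpus.

```python
from typing import Dict, Iterable, List, Sequence


def oov_rate(train_units: Iterable[str], test_units: Sequence[str]) -> float:
    """Return the fraction of test tokens that never occur in the training data."""
    vocabulary = set(train_units)
    if not test_units:
        return 0.0
    missing = sum(1 for unit in test_units if unit not in vocabulary)
    return missing / len(test_units)


# Hypothetical hand-segmented toy data (illustrative only, not from the paper).
SYLLABLES: Dict[str, List[str]] = {
    "dom": ["dom"],
    "domy": ["do", "my"],
    "kota": ["ko", "ta"],
    "domu": ["do", "mu"],
    "koty": ["ko", "ty"],
}

train_words = ["dom", "domy", "kota"]
test_words = ["domu", "koty"]  # inflected forms unseen in training


def to_syllables(words: Sequence[str]) -> List[str]:
    """Flatten a word sequence into its syllable sequence via the lookup table."""
    return [syllable for word in words for syllable in SYLLABLES[word]]


# OOV rate when the modeling unit is the whole word.
word_oov = oov_rate(train_words, test_words)
# OOV rate when the modeling unit is the syllable.
syllable_oov = oov_rate(to_syllables(train_words), to_syllables(test_words))

print(f"word OOV rate:     {word_oov:.2f}")
print(f"syllable OOV rate: {syllable_oov:.2f}")
```

In this toy setting both test words are unseen inflected forms, so every word token is out of vocabulary, while half of their syllables already occur in the training data. The numbers only illustrate the effect and say nothing about the sizes reported in the paper.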
Pages: 397-401
Number of pages: 5