Syllable Based Language Model for Large Vocabulary Continuous Speech Recognition of Polish

Authors
Majewski, Piotr [1]
Affiliation
[1] Univ Lodz, Fac Math & Comp Sci, PL-90238 Lodz, Poland
Source
Keywords
Polish; large vocabulary continuous speech recognition; language modeling; sub-word units; syllable-based units
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Most state-of-the-art large vocabulary continuous speech recognition systems use word-based n-gram language models. Such models are not an optimal solution for inflectional or agglutinative languages. Polish is a highly inflectional language and requires very large corpora to build an adequate language model with a small out-of-vocabulary ratio. We propose a syllable-based language model, which is better suited to highly inflectional languages such as Polish. When resources are scarce (i.e., corpora are small), the syllable-based model outperforms word-based models in terms of the number of out-of-vocabulary units (syllables in our model). Such a model is an approximation of the morpheme-based model for Polish. In this paper, we present the results of evaluating the syllable-based model and its usefulness in speech recognition tasks.
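The out-of-vocabulary comparison described in the abstract can be illustrated with a small sketch. This is not the paper's method: the `syllabify` function below is a hypothetical, deliberately naive splitter (consonants followed by vowels, with trailing consonants attached to the last syllable), and the word lists are toy data chosen only to show why sub-word units may reduce the out-of-vocabulary rate for inflected forms.

```python
import re

# Assumption: a toy vowel inventory and a naive syllable pattern.
# Real Polish syllabification is more involved; this is illustration only.
VOWELS = "aąeęioóuy"
_SYLLABLE = re.compile(rf"[^{VOWELS}]*[{VOWELS}]+(?:[^{VOWELS}]+$)?")


def syllabify(word):
    """Split a word into rough syllables (hypothetical helper)."""
    word = word.lower()
    parts = _SYLLABLE.findall(word)
    # Words without any vowel (e.g. the preposition "w") stay whole.
    return parts if parts else [word]


def oov_rate(train_units, test_units):
    """Fraction of test units that never occur in the training units."""
    vocab = set(train_units)
    test_units = list(test_units)
    if not test_units:
        return 0.0
    missing = sum(1 for unit in test_units if unit not in vocab)
    return missing / len(test_units)


# Toy data: the test words are other inflected forms of the training words.
train_words = ["kota", "domu", "mamy", "taty"]
test_words = ["koty", "mama", "tata", "domy"]

word_oov = oov_rate(train_words, test_words)

train_syllables = [s for w in train_words for s in syllabify(w)]
test_syllables = [s for w in test_words for s in syllabify(w)]
syllable_oov = oov_rate(train_syllables, test_syllables)

print(f"word OOV rate: {word_oov:.2f}")
print(f"syllable OOV rate: {syllable_oov:.2f}")
```

On this toy data every test word is unseen as a whole word, while all of its syllables already occur in the training set. Real corpora will show a much less extreme gap, and the result depends heavily on the syllabification rules used.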
Pages: 397-401
Number of pages: 5