Syllable language models for Mandarin speech recognition: Exploiting character language models

Cited by: 18
Authors
Liu, Xunying [1]
Hieronymus, James L. [2]
Gales, Mark J. F. [1]
Woodland, Philip C. [1]
Affiliations
[1] Univ Cambridge, Dept Engn, Cambridge CB2 1PZ, England
[2] Int Comp Sci Inst, Berkeley, CA 94704 USA
Source
Keywords
CHINESE-LANGUAGE; ADAPTATION; ALGORITHM;
DOI
10.1121/1.4768800
Chinese Library Classification number
O42 [Acoustics];
Discipline classification code
070206 ; 082403 ;
Abstract
Mandarin Chinese is based on characters which are syllabic in nature and morphological in meaning. All spoken languages have syllabiotactic rules which govern the construction of syllables and their allowed sequences. These constraints are not as restrictive as those learned from word sequences, but they can provide additional useful linguistic information. Hence, it is possible to improve speech recognition performance by appropriately combining these two types of constraints. For the Chinese language considered in this paper, character-level language models (LMs) can be used as a first-level approximation to allowed syllable sequences. To test this idea, word-level and character-level n-gram LMs were trained on 2.8 billion words (equivalent to 4.3 billion characters) of text from a wide collection of sources. Both hypothesis-based and model-based combination techniques were investigated to combine word-level and character-level LMs. Significant character error rate reductions of up to 7.3% relative were obtained on a state-of-the-art Mandarin Chinese broadcast audio recognition task using an adapted history-dependent multi-level LM that performs a log-linear combination of character-level and word-level LMs. This supports the hypothesis that character or syllable sequence models are useful for improving Mandarin speech recognition performance. (C) 2013 Acoustical Society of America. [http://dx.doi.org/10.1121/1.4768800]
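As a rough illustration of the log-linear combination mentioned in the abstract, the sketch below rescores an N-best list by taking a weighted sum of word-level and character-level LM log-probabilities. It is only a minimal sketch under stated assumptions: the function names, the fixed global weights, and the toy scores are all hypothetical, and the paper's adapted history-dependent multi-level LM is more elaborate than this fixed-weight hypothesis-level scheme.

```python
from typing import List, Tuple

# Each hypothesis: (text, word-level LM log-prob, character-level LM log-prob).
# The scores are assumed to be supplied by two separately trained LMs.
Hypothesis = Tuple[str, float, float]


def log_linear_score(word_logprob: float, char_logprob: float,
                     word_weight: float = 0.5, char_weight: float = 0.5) -> float:
    """Weighted sum of log-probabilities, i.e. a log-linear combination.

    The result is an unnormalised score, suitable for ranking hypotheses only.
    The fixed weights are an assumption made for this sketch.
    """
    return word_weight * word_logprob + char_weight * char_logprob


def rescore(hypotheses: List[Hypothesis],
            word_weight: float = 0.5, char_weight: float = 0.5) -> str:
    """Return the text of the hypothesis with the highest combined score."""
    best = max(
        hypotheses,
        key=lambda h: log_linear_score(h[1], h[2], word_weight, char_weight),
    )
    return best[0]


# Toy N-best list with made-up scores (not taken from the paper).
nbest = [("hyp_a", -10.0, -20.0), ("hyp_b", -12.0, -14.0)]
best_combined = rescore(nbest)                        # both LMs weighted equally
best_word_only = rescore(nbest, word_weight=1.0, char_weight=0.0)
```

With equal weights the character-level score changes which hypothesis ranks first; with the character weight set to zero the ranking falls back to the word-level LM alone.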
Pages: 519-528
Page count: 10
Related papers
50 in total
  • [31] Neural candidate-aware language models for speech recognition
    Tanaka, Tomohiro
    Masumura, Ryo
    Oba, Takanobu
    COMPUTER SPEECH AND LANGUAGE, 2021, 66
  • [32] SPEECH RECOGNITION - ACOUSTIC, PHONETIC AND FORMAL-LANGUAGE MODELS
    MERMELSTEIN, P
    LEVINSON, S
    BIOTELEMETRY, 1975, 2 (1-2) : 121 - 123
  • [33] Morpholexical and Discriminative Language Models for Turkish Automatic Speech Recognition
    Sak, Hasim
    Saraclar, Murat
    Gungor, Tunga
    IEEE TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2012, 20 (08): : 2341 - 2351
  • [34] Acoustic and Language Models Adaptation for Indonesian Spontaneous Speech Recognition
    Lestari, Dessi Puji
    Irfani, Angela
    2015 2ND INTERNATIONAL CONFERENCE ON ADVANCED INFORMATICS: CONCEPTS, THEORY AND APPLICATIONS ICAICTA, 2015,
  • [35] Statistical Transformation of Language and Pronunciation Models for Spontaneous Speech Recognition
    Akita, Yuya
    Kawahara, Tatsuya
    IEEE TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2010, 18 (06): : 1539 - 1549
  • [36] K-TLSS(S) language models for speech recognition
    Bordel, G
    Varona, A
    Torres, MI
    1997 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOLS I - V: VOL I: PLENARY, EXPERT SUMMARIES, SPECIAL, AUDIO, UNDERWATER ACOUSTICS, VLSI; VOL II: SPEECH PROCESSING; VOL III: SPEECH PROCESSING, DIGITAL SIGNAL PROCESSING; VOL IV: MULTIDIMENSIONAL SIGNAL PROCESSING, NEURAL NETWORKS, 1997, : 819 - 822
  • [37] Speaker Independent Speech Recognition Implementation with Adaptive Language Models
    Anukriti
    Tiwari, Sushant
    Chatterjee, Tanmay
    Bhattacharya, Mahua
    2013 INTERNATIONAL SYMPOSIUM ON COMPUTATIONAL AND BUSINESS INTELLIGENCE (ISCBI), 2013, : 7 - 10
  • [38] VioLA: Conditional Language Models for Speech Recognition, Synthesis, and Translation
    Wang, Tianrui
    Zhou, Long
    Zhang, Ziqiang
    Wu, Yu
    Liu, Shujie
    Gaur, Yashesh
    Chen, Zhuo
    Li, Jinyu
    Wei, Furu
    IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2024, 32 : 3709 - 3716
  • [39] LSTM-Based Language Models for Spontaneous Speech Recognition
    Medennikov, Ivan
    Bulusheva, Anna
    SPEECH AND COMPUTER, 2016, 9811 : 469 - 475
  • [40] Development of Language Models for Continuous Uzbek Speech Recognition System
    Mukhamadiyev, Abdinabi
    Mukhiddinov, Mukhriddin
    Khujayarov, Ilyos
    Ochilov, Mannon
    Cho, Jinsoo
    SENSORS, 2023, 23 (03)