An embedded multilingual speech recognition system for Mandarin, Cantonese, and English

Cited by: 0
Authors
Wang, X [1]
Cao, Y [1]
Ding, F [1]
Tang, YZ [1]
Affiliations
[1] Nokia Res Ctr, Audio Visual Syst Lab, Beijing, Peoples R China
Keywords
embedded multilingual speech recognition; non-native speech recognition; automatic language identification
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
In this paper, we propose a small-footprint, speaker-independent, multilingual system for isolated word recognition of Mandarin, Cantonese, and English. The baseline system achieved very promising results without sharing any phonemes between languages. Sharing phonemes reduced memory and computational complexity by about 40%. Support for non-native, accented speech and for mixed-language words is the distinguishing feature of our system. Automatic language identification (LID) is one of the key elements in language-independent automatic speech recognition (ASR) systems. LID performance is also analyzed in addition to the engine performance of the proposed system. Supervised Bayesian online adaptation proved effective in compensating for accent mismatch, environment mismatch, and modeling inaccuracy introduced by combined training.
Pages: 758-764
Number of pages: 7
Related papers
50 in total
  • [21] Development of a Mandarin-English bilingual Speech Recognition System for real world music retrieval
    Zhang, Qingqing
    Pan, Jielin
    Lin, Yang
    Shao, Jian
    Yan, Yonghong
    IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS, 2008, E91D (03) : 514 - 521
  • [22] Automatic speech recognition of Cantonese-English code-mixing utterances
    Chan, Joyce Y. C.
    Ching, P. C.
    Lee, Tan
    Cao, Houwei
    INTERSPEECH 2006 AND 9TH INTERNATIONAL CONFERENCE ON SPOKEN LANGUAGE PROCESSING, VOLS 1-5, 2006, : 113 - 116
  • [23] Towards Language-Universal Mandarin-English Speech Recognition
    Zhang, Shiliang
    Liu, Yuan
    Lei, Ming
    Ma, Bin
    Xie, Lei
    INTERSPEECH 2019, 2019, : 2170 - 2174
  • [24] CROSS-LINGUAL AND MULTILINGUAL SPEECH EMOTION RECOGNITION ON ENGLISH AND FRENCH
    Neumann, Michael
    Ngoc Thang Vu
    2018 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2018, : 5769 - 5773
  • [25] An embedded system for speech recognition and compression
    Yang, ZZ
    Liu, J
    Eric, C
    Guan, LC
    Chin, CK
    International Symposium on Communications and Information Technologies 2005, Vols 1 and 2, Proceedings, 2005, : 287 - 290
  • [26] Research of Embedded Speech Recognition System
    Zhu Xuelai
    PROCEEDINGS OF THE THIRD INTERNATIONAL SYMPOSIUM ON TEST AUTOMATION & INSTRUMENTATION, VOLS 1 - 4, 2010, : 1481 - 1484
  • [27] A unified system for multilingual speech recognition and language identification
    Liu, Danyang
    Xu, Ji
    Zhang, Pengyuan
    Yan, Yonghong
    SPEECH COMMUNICATION, 2021, 127 : 17 - 28
  • [28] MULTI-PRONOUNCIATION DICTIONARY CONSTRUCTION FOR MANDARIN-ENGLISH BILINGUAL PHRASE SPEECH RECOGNITION SYSTEM
    Wang, C.
    Shi, W.
    Zou, Y. X.
    2015 IEEE CHINA SUMMIT & INTERNATIONAL CONFERENCE ON SIGNAL AND INFORMATION PROCESSING, 2015, : 15 - 19
  • [29] English Speech Recognition System on Chip
    刘鸿
    钱彦旻
    刘加
    Tsinghua Science and Technology, 2011, 16 (01) : 95 - 99
  • [30] Vowel discrimination by speakers of English, German, Japanese, Mandarin and Cantonese
    Bennett, DC
    Zhang, JL
    Lü, SN
    Hu, XH
    Abakuks, A
    Hazan, V
    LACUS FORUM XXVII: SPEAKING AND COMPREHENDING, 2001, 27 : 285 - 296