SPEECH RECOGNITION OF FOREIGN OUT-OF-VOCABULARY WORDS USING A HIERARCHICAL LANGUAGE MODEL

Cited by: 0
Authors
Yamamoto, Hirofumi [1]
Kikui, Genichiro [2]
Nakamura, Satoshi [1, 2]
Sagisaka, Yoshinori [1, 3]
Affiliations
[1] Natl Inst Informat & Commun Technol, 2-2-2 Hikaridai, Seika, Kyoto, Japan
[2] ATR Spoken Language Commun Res Labs, Kyoto, Japan
[3] Waseda Univ, GITI, Tokyo, Japan
Keywords
Speech Recognition; Language model; Foreign word; Out-of-Vocabulary word; Hierarchical language model
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
This paper proposes a new speech recognition scheme for foreign out-of-vocabulary words embedded in native-language speech. To recognize foreign names frequently observed in news speech or in translation speech, we adopted a hierarchical language model that had been successfully applied to OOV words covering native vocabularies. In this hierarchical language model, OOV vocabularies are modeled as a word-class model in the upper-layered model, and their statistical phonotactic constraints are modeled in the lower-layered model. Since extra statistics are needed to cover foreign words and their pronunciation differences, we have introduced two techniques. The first is to combine translation target language models and translation source statistics of OOVs using the hierarchical language model. The second is to automatically generate recognition target pronunciations from original pronunciations by syllable-to-syllable mapping. To confirm the validity of this recognition scheme, we have conducted speech recognition experiments using English speech including Japanese personal names as OOV words. The proposed method outperformed the existing algorithm using a lexicon consisting of all the words in the training set. Surprisingly, it achieved better OOV recognition results than the non-OOV condition where all the proper names in the test set are registered in the lexicon.
Pages: 1870 / +
Page count: 2
Related papers
50 in total
  • [31] Few-Shot Representation Learning for Out-Of-Vocabulary Words
    Hu, Ziniu
    Chen, Ting
    Chang, Kai-Wei
    Sun, Yizhou
    57TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2019), 2019, : 4102 - 4112
  • [32] Multi-level out-of-vocabulary words handling approach
    Lochter, Johannes V.
    Silva, Renato M.
    Almeida, Tiago A.
    KNOWLEDGE-BASED SYSTEMS, 2022, 251
  • [33] Improving Recognition of Out-of-vocabulary Words in E2E Code-switching ASR by Fusing Speech Generation Methods
    Ye, Lingxuan
    Cheng, Gaofeng
    Yang, Runyan
    Yang, Zehui
    Tian, Sanli
    Zhang, Pengyuan
    Yan, Yonghong
    INTERSPEECH 2022, 2022, : 3163 - 3167
  • [34] Exploiting Out-of-Vocabulary Words for Out-of-Domain Detection in Dialog Systems
    Ryu, Seonghan
    Lee, Donghyeon
    Lee, Gary Geunbae
    Kim, Kyungduk
    Noh, Hyungjong
    2014 INTERNATIONAL CONFERENCE ON BIG DATA AND SMART COMPUTING (BIGCOMP), 2014, : 165 - +
  • [35] Detection Of Pronunciation Out Of Vocabulary Words From Speech Recognition System
    Degtyarev, Vladimir M.
    Gusev, Mikhail N.
    EUROCON 2009: INTERNATIONAL IEEE CONFERENCE DEVOTED TO THE 150 ANNIVERSARY OF ALEXANDER S. POPOV, VOLS 1- 4, PROCEEDINGS, 2009, : 1723 - 1728
  • [36] Predicting the out-of-vocabulary rate and the required vocabulary size for speech processing applications
    Muller, J
    Stahl, H
    Lang, M
    ICSLP 96 - FOURTH INTERNATIONAL CONFERENCE ON SPOKEN LANGUAGE PROCESSING, PROCEEDINGS, VOLS 1-4, 1996, : 1922 - 1925
  • [37] Using the Web to create dynamic dictionaries in handwritten out-of-vocabulary word recognition
    Oprean, Cristina
    Likforman-Sulem, Laurence
    Popescu, Adrian
    Mokbel, Chafic
    2013 12TH INTERNATIONAL CONFERENCE ON DOCUMENT ANALYSIS AND RECOGNITION (ICDAR), 2013, : 989 - 993
  • [38] Out-of-vocabulary word rejection algorithm in Korean variable vocabulary word recognition
    Moon, KS
    Kim, YJ
    Kim, HR
    Chung, JH
    ISCAS 2000: IEEE INTERNATIONAL SYMPOSIUM ON CIRCUITS AND SYSTEMS - PROCEEDINGS, VOL V: EMERGING TECHNOLOGIES FOR THE 21ST CENTURY, 2000, : 53 - 56
  • [39] A Large Corpus of Product Reviews in Portuguese: Tackling Out-Of-Vocabulary Words
    Hartmann, Nathan S.
    Avanco, Lucas V.
    Balage, Pedro P.
    Duran, Magali S.
    Nunes, Maria G. V.
    Pardo, Thiago A. S.
    Aluisio, Sandra M.
    LREC 2014 - NINTH INTERNATIONAL CONFERENCE ON LANGUAGE RESOURCES AND EVALUATION, 2014, : 3865 - 3871
  • [40] Online PLSA: Batch Updating Techniques Including Out-of-Vocabulary Words
    Bassiou, Nikoletta K.
    Kotropoulos, Constantine L.
    IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2014, 25 (11) : 1953 - 1966