Introduction of Semantic Model to Help Speech Recognition

Cited by: 1
Authors
Level, Stephane [1]
Illina, Irina [1]
Fohr, Dominique [1]
Affiliations
[1] Univ Lorraine, INRIA, CNRS, F-54000 Nancy, France
Source
TEXT, SPEECH, AND DIALOGUE (TSD 2020) | 2020 / Vol. 12284
Keywords
Automatic Speech Recognition; Semantic context; Embeddings
DOI
10.1007/978-3-030-58323-1_41
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Current Automatic Speech Recognition (ASR) systems mainly take into account acoustic, lexical, and local syntactic information; long-term semantic relations are not used. ASR performance degrades significantly when the training and testing conditions differ, for example because of noise, and in such cases the acoustic information can be less reliable. To help ASR in noisy conditions, we propose to supplement the ASR system with a semantic module. This module re-evaluates the N-best list of speech recognition hypotheses and can be seen as a form of adaptation to noise. For words in the processed sentence that may have been poorly recognized, the module chooses words that better match the semantic context of the sentence. To achieve this, we introduce the notions of a context part and possibility zones, which measure the similarity between the semantic context of the document and the corresponding possible hypotheses. The proposed methodology uses two continuous representations of words: word2vec and FastText. We conduct experiments on the publicly available TED conference dataset (TED-LIUM) mixed with real noise. The proposed method achieves a significant improvement in word error rate (WER) over the ASR system without semantic information.
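The general idea of re-evaluating an N-best list with word embeddings can be illustrated with a minimal sketch. This is not the authors' algorithm: the abstract does not define how the context part and possibility zones are computed, so the sketch assumes, purely for illustration, that words shared by all hypotheses act as the context and that words on which the hypotheses disagree are scored by cosine similarity to that context. The function names, the toy two-dimensional embeddings, the scores, and the interpolation weight `alpha` are all hypothetical; in practice the vectors would come from a trained word2vec or FastText model.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def mean_vector(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def rescore_nbest(hypotheses, embeddings, alpha=0.5):
    """Re-rank (words, asr_score) pairs with a semantic similarity term.

    Assumption (not from the paper): words common to every hypothesis
    form the context; the remaining words are the uncertain ones.
    """
    shared = set.intersection(*(set(words) for words, _ in hypotheses))
    context_vecs = [embeddings[w] for w in sorted(shared) if w in embeddings]
    if not context_vecs:
        # No usable context: fall back to the original ASR ranking.
        return sorted(hypotheses, key=lambda h: h[1], reverse=True)

    context = mean_vector(context_vecs)
    rescored = []
    for words, asr_score in hypotheses:
        uncertain = [w for w in words if w not in shared and w in embeddings]
        if uncertain:
            semantic = sum(cosine(embeddings[w], context) for w in uncertain) / len(uncertain)
        else:
            semantic = 0.0
        # Linear interpolation of the ASR score and the semantic score.
        rescored.append((words, (1 - alpha) * asr_score + alpha * semantic))
    return sorted(rescored, key=lambda h: h[1], reverse=True)


# Hypothetical toy data: 2-D vectors stand in for real embeddings.
toy_embeddings = {
    "speech": [1.0, 0.1],
    "recognition": [0.9, 0.2],
    "system": [0.8, 0.3],
    "wreck": [0.1, 1.0],
}
nbest = [
    (["speech", "recognition", "wreck"], 0.55),   # top ASR hypothesis
    (["speech", "recognition", "system"], 0.45),
]
best = rescore_nbest(nbest, toy_embeddings)
print(best[0][0])
```

In this toy case the second hypothesis overtakes the first because "system" lies closer to the shared context than "wreck" does. The linear interpolation is only one possible way to combine the two scores.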
Pages: 377-385
Number of pages: 9