An overview of decoding techniques for large vocabulary continuous speech recognition

Cited by: 56
Author
Aubert, XL [1]
Affiliation
[1] Philips Res Labs, D-52066 Aachen, Germany
Source
COMPUTER SPEECH AND LANGUAGE | 2002 / Vol. 16 / Issue 01
DOI
10.1006/csla.2001.0185
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
A number of decoding strategies for large vocabulary continuous speech recognition (LVCSR) are examined from the viewpoint of their search space representation. Different design solutions are compared with respect to the integration of linguistic and acoustic constraints, as implied by m-gram language models (LM) and cross-word (CW) phonetic contexts. This study is structured along two main axes: the network expansion and the search algorithm itself. The network can be expanded statically or dynamically, while the search can proceed either time-synchronously or asynchronously, which leads to distinct architectures. Three broad classes of decoding methods are briefly reviewed: the use of weighted finite state transducers (WFST) for static network expansion, the time-synchronous dynamic-expansion search and the asynchronous stack decoding. Heuristic methods for further reducing the search space are also considered. The main approaches are compared and some prospective views are formulated regarding possible future avenues. © 2002 Academic Press.
Pages: 89-114
Page count: 26
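The time-synchronous search named in the abstract can be illustrated with a minimal sketch. This is not the paper's decoder: it is a toy Viterbi beam search over a small, statically given state graph, with no language model, lexicon tree, or cross-word contexts. The function name `beam_viterbi`, its parameters, and the dictionary-based model layout are all hypothetical choices made for illustration.

```python
import math

def beam_viterbi(obs_loglik, trans, init, beam=10.0):
    """Toy time-synchronous Viterbi beam search (illustrative assumption).

    obs_loglik: list of dicts, one per frame: state -> log p(frame | state)
    trans:      dict: state -> list of (next_state, log transition prob)
    init:       dict: state -> log initial prob
    beam:       hypotheses scoring below (best - beam) are pruned each frame
    Returns (best log score, best state path).
    """
    # Hypotheses active at frame 0: state -> (score, path so far)
    active = {
        s: (lp + obs_loglik[0].get(s, -math.inf), [s])
        for s, lp in init.items()
    }
    # All hypotheses advance together, one frame at a time.
    for frame in obs_loglik[1:]:
        expanded = {}
        for s, (score, path) in active.items():
            for nxt, lp in trans.get(s, []):
                new_score = score + lp + frame.get(nxt, -math.inf)
                # Viterbi recombination: keep the best path into each state.
                if nxt not in expanded or new_score > expanded[nxt][0]:
                    expanded[nxt] = (new_score, path + [nxt])
        if not expanded:
            raise ValueError("no active hypotheses left to expand")
        # Beam pruning relative to the best hypothesis of this frame.
        best = max(v[0] for v in expanded.values())
        active = {s: v for s, v in expanded.items() if v[0] >= best - beam}
    return max(active.values(), key=lambda v: v[0])

# Usage with a made-up two-state model
init = {"a": 0.0}
trans = {
    "a": [("a", math.log(0.5)), ("b", math.log(0.5))],
    "b": [("b", 0.0)],
}
obs = [
    {"a": math.log(0.9), "b": math.log(0.1)},
    {"a": math.log(0.2), "b": math.log(0.8)},
    {"a": math.log(0.1), "b": math.log(0.9)},
]
score, path = beam_viterbi(obs, trans, init)
```

Because every hypothesis covers the same number of frames at each step, scores are directly comparable and a single beam threshold is enough for pruning. An asynchronous stack decoder, by contrast, compares hypotheses of different lengths and therefore needs some form of score normalisation or look-ahead estimate.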
Related papers
50 in total
  • [31] A LARGE VOCABULARY CONTINUOUS SPEECH RECOGNITION SYSTEM WITH HIGH PREDICTABILITY
    SHIGENAGA, M
    SEKIGUCHI, Y
    YAMAGUCHI, T
    MASUDA, R
    IEICE TRANSACTIONS ON COMMUNICATIONS ELECTRONICS INFORMATION AND SYSTEMS, 1991, 74 (07): 1817-1825
  • [32] Feature selection in mandarin large vocabulary continuous speech recognition
    Zhu, X
    Chen, YN
    Liu, J
    Liu, RS
    2002 6TH INTERNATIONAL CONFERENCE ON SIGNAL PROCESSING PROCEEDINGS, VOLS I AND II, 2002: 508-511
  • [33] Using a transcription graph for large vocabulary continuous speech recognition
    Li, Z
    O'Shaughnessy, D
    1996 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, CONFERENCE PROCEEDINGS, VOLS 1-6, 1996: 121-124
  • [34] DISTRIBUTED SUBMODULAR MAXIMIZATION FOR LARGE VOCABULARY CONTINUOUS SPEECH RECOGNITION
    Qi, Jun
    Liu, Xu
    Kamijo, Shunshuke
    Tejedor, Javier
    2018 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2018: 2501-2505
  • [35] A word graph algorithm for large vocabulary continuous speech recognition
    Ortmanns, S
    Ney, H
    Aubert, X
    COMPUTER SPEECH AND LANGUAGE, 1997, 11 (01): 43-72
  • [36] A large vocabulary continuous speech recognition system for Persian language
    Hossein Sameti
    Hadi Veisi
    Mohammad Bahrani
    Bagher Babaali
    Khosro Hosseinzadeh
    EURASIP Journal on Audio, Speech, and Music Processing, 2011
  • [37] Large Vocabulary Continuous Audio-Visual Speech Recognition
    Sterpu, George
    ICMI'18: PROCEEDINGS OF THE 20TH ACM INTERNATIONAL CONFERENCE ON MULTIMODAL INTERACTION, 2018: 538-541
  • [38] On designing pronunciation lexicons for large vocabulary, continuous speech recognition
    Lamel, L
    Adda, G
    ICSLP 96 - FOURTH INTERNATIONAL CONFERENCE ON SPOKEN LANGUAGE PROCESSING, PROCEEDINGS, VOLS 1-4, 1996: 6-9
  • [39] DEEP-FSMN FOR LARGE VOCABULARY CONTINUOUS SPEECH RECOGNITION
    Zhang, Shiliang
    Lei, Ming
    Yan, Zhijie
    Dai, Lirong
    2018 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2018: 5869-5873
  • [40] Large-vocabulary continuous speech recognition: Advances and applications
    Gauvain, JL
    Lamel, L
    PROCEEDINGS OF THE IEEE, 2000, 88 (08): 1181-1200