An overview of decoding techniques for large vocabulary continuous speech recognition

被引:56
|
作者
Aubert, XL [1 ]
机构
[1] Philips Res Labs, D-52066 Aachen, Germany
来源
COMPUTER SPEECH AND LANGUAGE | 2002年 / 16卷 / 01期
关键词
D O I
10.1006/csla.2001.0185
中图分类号
TP18 [人工智能理论];
学科分类号
081104 ; 0812 ; 0835 ; 1405 ;
摘要
A number of decoding strategies for large vocabulary continuous speech recognition (LVCSR) are examined from the viewpoint of their search space representation. Different design solutions are compared with respect to the integration of linguistic and acoustic constraints, as implied by m-gram language models (LM) and cross-word (CW) phonetic contexts. This study is structured along two main axes: the network expansion and the search algorithm itself. The network can be expanded statically or dynamically while the search can proceed either time-synchronously or asynchronously which leads to distinct architectures. Three broad classes of decoding methods are briefly reviewed: the use of weighted finite state transducers (WFST) for static network expansion, the time-synchronous dynamic-expansion search and the asynchronous stack decoding. Heuristic methods for further reducing the search space are also considered. The main approaches are compared and some prospective views are formulated regarding possible future avenues. (C) 2002 Academic Press.
引用
收藏
页码:89 / 114
页数:26
相关论文
共 50 条
  • [41] A large-vocabulary continuous speech recognition system for Hindi
    Kumar, M
    Rajput, N
    Verma, A
    IBM JOURNAL OF RESEARCH AND DEVELOPMENT, 2004, 48 (5-6) : 703 - 715
  • [42] Large-Vocabulary Continuous Speech Recognition of Lhasa Tibetan
    Li, Guanyu
    Yu, Hongzhi
    COMPUTER AND INFORMATION TECHNOLOGY, 2014, 519-520 : 802 - 806
  • [43] Phone deactivation pruning in large vocabulary continuous speech recognition
    Renals, S
    IEEE SIGNAL PROCESSING LETTERS, 1996, 3 (01) : 4 - 6
  • [44] Speaker verification through large vocabulary continuous speech recognition
    Newman, M
    Gillick, L
    Ito, Y
    McAllaster, D
    Peskin, B
    ICSLP 96 - FOURTH INTERNATIONAL CONFERENCE ON SPOKEN LANGUAGE PROCESSING, PROCEEDINGS, VOLS 1-4, 1996, : 2419 - 2422
  • [45] Connectionist language modeling for large vocabulary continuous speech recognition
    Schwenk, H
    Gauvain, JL
    2002 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOLS I-IV, PROCEEDINGS, 2002, : 765 - 768
  • [46] Syllable-based large vocabulary continuous speech recognition
    Ganapathiraju, A
    Hamaker, J
    Picone, J
    Ordowski, M
    Doddington, GR
    IEEE TRANSACTIONS ON SPEECH AND AUDIO PROCESSING, 2001, 9 (04): : 358 - 366
  • [47] Integrating Stress Information in Large Vocabulary Continuous Speech Recognition
    Ludusan, Bogdan
    Ziegler, Stefan
    Gravier, Guillaume
    13TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2012 (INTERSPEECH 2012), VOLS 1-3, 2012, : 2641 - 2644
  • [48] Speaker selection training for large vocabulary continuous speech recognition
    Huang, C
    Chen, T
    Chang, E
    2002 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOLS I-IV, PROCEEDINGS, 2002, : 609 - 612
  • [49] IMPROVEMENTS ON BOTTLENECK FEATURE FOR LARGE VOCABULARY CONTINUOUS SPEECH RECOGNITION
    Tuerxun, Maimaitiaili
    Zhang, Shiliang
    Bao, Yebo
    Dai, Lirong
    2014 12TH INTERNATIONAL CONFERENCE ON SIGNAL PROCESSING (ICSP), 2014, : 516 - 520
  • [50] A LAYERED APPROACH FOR DUTCH LARGE VOCABULARY CONTINUOUS SPEECH RECOGNITION
    Pelemans, Joris
    Demuynck, Kris
    Wambacq, Patrick
    2012 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2012, : 4421 - 4424