An overview of decoding techniques for large vocabulary continuous speech recognition

Cited by: 56

Authors:

Aubert, XL [1]

Affiliation:

[1] Philips Res Labs, D-52066 Aachen, Germany

Source:

COMPUTER SPEECH AND LANGUAGE | 2002 / Vol. 16 / No. 01

DOI:

10.1006/csla.2001.0185

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

A number of decoding strategies for large vocabulary continuous speech recognition (LVCSR) are examined from the viewpoint of their search space representation. Different design solutions are compared with respect to the integration of linguistic and acoustic constraints, as implied by m-gram language models (LM) and cross-word (CW) phonetic contexts. This study is structured along two main axes: the network expansion and the search algorithm itself. The network can be expanded statically or dynamically, while the search can proceed either time-synchronously or asynchronously, which leads to distinct architectures. Three broad classes of decoding methods are briefly reviewed: the use of weighted finite state transducers (WFST) for static network expansion, the time-synchronous dynamic-expansion search, and the asynchronous stack decoding. Heuristic methods for further reducing the search space are also considered. The main approaches are compared and some prospective views are formulated regarding possible future avenues. (C) 2002 Academic Press.

Pages: 89 - 114

Page count: 26

50 records in total

[41] A large-vocabulary continuous speech recognition system for Hindi
Kumar, M
Rajput, N
Verma, A
IBM JOURNAL OF RESEARCH AND DEVELOPMENT, 2004, 48 (5-6) : 703 - 715
[42] Large-Vocabulary Continuous Speech Recognition of Lhasa Tibetan
Li, Guanyu
Yu, Hongzhi
COMPUTER AND INFORMATION TECHNOLOGY, 2014, 519-520 : 802 - 806
[43] Phone deactivation pruning in large vocabulary continuous speech recognition
Renals, S
IEEE SIGNAL PROCESSING LETTERS, 1996, 3 (01) : 4 - 6
[44] Speaker verification through large vocabulary continuous speech recognition
Newman, M
Gillick, L
Ito, Y
McAllaster, D
Peskin, B
ICSLP 96 - FOURTH INTERNATIONAL CONFERENCE ON SPOKEN LANGUAGE PROCESSING, PROCEEDINGS, VOLS 1-4, 1996, : 2419 - 2422
[45] Connectionist language modeling for large vocabulary continuous speech recognition
Schwenk, H
Gauvain, JL
2002 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOLS I-IV, PROCEEDINGS, 2002, : 765 - 768
[46] Syllable-based large vocabulary continuous speech recognition
Ganapathiraju, A
Hamaker, J
Picone, J
Ordowski, M
Doddington, GR
IEEE TRANSACTIONS ON SPEECH AND AUDIO PROCESSING, 2001, 9 (04) : 358 - 366
[47] Integrating Stress Information in Large Vocabulary Continuous Speech Recognition
Ludusan, Bogdan
Ziegler, Stefan
Gravier, Guillaume
13TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2012 (INTERSPEECH 2012), VOLS 1-3, 2012, : 2641 - 2644
[48] Speaker selection training for large vocabulary continuous speech recognition
Huang, C
Chen, T
Chang, E
2002 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOLS I-IV, PROCEEDINGS, 2002, : 609 - 612
[49] IMPROVEMENTS ON BOTTLENECK FEATURE FOR LARGE VOCABULARY CONTINUOUS SPEECH RECOGNITION
Tuerxun, Maimaitiaili
Zhang, Shiliang
Bao, Yebo
Dai, Lirong
2014 12TH INTERNATIONAL CONFERENCE ON SIGNAL PROCESSING (ICSP), 2014, : 516 - 520
[50] A LAYERED APPROACH FOR DUTCH LARGE VOCABULARY CONTINUOUS SPEECH RECOGNITION
Pelemans, Joris
Demuynck, Kris
Wambacq, Patrick
2012 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2012, : 4421 - 4424
