An overview of decoding techniques for large vocabulary continuous speech recognition

Cited by: 56

Author:

Aubert, XL [1]

Affiliation:

[1] Philips Res Labs, D-52066 Aachen, Germany

Source:

COMPUTER SPEECH AND LANGUAGE | 2002 / Vol. 16 / Issue 01

DOI:

10.1006/csla.2001.0185

Chinese Library Classification:

TP18 [Artificial intelligence theory]

Discipline classification codes:

081104; 0812; 0835; 1405

Abstract:

A number of decoding strategies for large vocabulary continuous speech recognition (LVCSR) are examined from the viewpoint of their search space representation. Different design solutions are compared with respect to the integration of linguistic and acoustic constraints, as implied by m-gram language models (LM) and cross-word (CW) phonetic contexts. This study is structured along two main axes: the network expansion and the search algorithm itself. The network can be expanded statically or dynamically, while the search can proceed either time-synchronously or asynchronously, which leads to distinct architectures. Three broad classes of decoding methods are briefly reviewed: the use of weighted finite state transducers (WFST) for static network expansion, the time-synchronous dynamic-expansion search, and the asynchronous stack decoding. Heuristic methods for further reducing the search space are also considered. The main approaches are compared and some prospective views are formulated regarding possible future avenues. © 2002 Academic Press.

Pages: 89-114

Page count: 26
