Full expansion of context-dependent networks in large vocabulary speech recognition

被引:0
|
作者
Mohri, M [1 ]
Riley, M [1 ]
Hindle, D [1 ]
Ljolje, A [1 ]
Pereira, F [1 ]
机构
[1] AT&T Bell Labs, Res, Florham Park, NJ 07932 USA
关键词
D O I
暂无
中图分类号
O42 [声学];
学科分类号
070206 ; 082403 ;
摘要
We combine our earlier approach to context-dependent network representation with our algorithm for determinizing weighted networks to build optimized networks for large-vocabulary speech recognition combining an n-gram language model, a pronunciation dictionary and context-dependency modeling. While fully-expanded networks have been used before in restrictive settings (medium vocabulary or no cross-word contexts), we demonstrate that our network determinization method makes it practical to use fully-expanded networks also in large-vocabulary recognition with full cross-word context modeling. For the DARPA North American Business News task (NAB), we give network sizes and recognition speeds and accuracies using bigram and trigram grammars with vocabulary sizes ranging from 10,000 to 160,000 words. With our construction, the fully-expanded NAB context-dependent networks contain only about twice as many arcs as the corresponding language models. Interestingly, we also find that, with these networks, real-time word accuracy is improved by increasing vocabulary size and n-gram order.
引用
收藏
页码:665 / 668
页数:4
相关论文
共 50 条
  • [21] Research on context-dependent acoustical unit (triphone) for mandarin continuous speech recognition
    Zhao, Qingwei
    Wang, Zuoying
    Lu, Dajin
    Tien Tzu Hsueh Pao/Acta Electronica Sinica, 1999, 27 (06): : 79 - 82
  • [22] Integration of context-dependent durational knowledge into HMM-based speech recognition
    Wang, X
    tenBosch, LFM
    Pols, LCW
    ICSLP 96 - FOURTH INTERNATIONAL CONFERENCE ON SPOKEN LANGUAGE PROCESSING, PROCEEDINGS, VOLS 1-4, 1996, : 1073 - 1076
  • [23] FROM SENONES TO CHENONES: TIED CONTEXT-DEPENDENT GRAPHEMES FOR HYBRID SPEECH RECOGNITION
    Le, Duc
    Zhang, Xiaohui
    Zheng, Weiyi
    Fugen, Christian
    Zweig, Geoffrey
    Seltzer, Michael L.
    2019 IEEE AUTOMATIC SPEECH RECOGNITION AND UNDERSTANDING WORKSHOP (ASRU 2019), 2019, : 457 - 464
  • [24] Application of Pretrained Deep Neural Networks to Large Vocabulary Speech Recognition
    Jaitly, Navdeep
    Patrick Nguyen
    Senior, Andrew
    Vanhoucke, Vincent
    13TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2012 (INTERSPEECH 2012), VOLS 1-3, 2012, : 2577 - 2580
  • [25] Deep Spiking Neural Networks for Large Vocabulary Automatic Speech Recognition
    Wu, Jibin
    Yilmaz, Emre
    Zhang, Malu
    Li, Haizhou
    Tan, Kay Chen
    FRONTIERS IN NEUROSCIENCE, 2020, 14
  • [26] EXPLOITING SPARSENESS IN DEEP NEURAL NETWORKS FOR LARGE VOCABULARY SPEECH RECOGNITION
    Yu, Dong
    Seide, Frank
    Li, Gang
    Deng, Li
    2012 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2012, : 4409 - 4412
  • [27] Large Vocabulary Speech Recognition Using Deep Tensor Neural Networks
    Yu, Dong
    Deng, Li
    Seide, Frank
    13TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2012 (INTERSPEECH 2012), VOLS 1-3, 2012, : 6 - 9
  • [28] Large vocabulary speech recognition in French
    Adda-Decker, M
    Adda, G
    Gauvain, JL
    Lamel, L
    ICASSP '99: 1999 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, PROCEEDINGS VOLS I-VI, 1999, : 45 - 48
  • [29] Advances in Large Vocabulary Speech Recognition
    Gauvain, JL
    De Mori, R
    Lamel, L
    COMPUTER SPEECH AND LANGUAGE, 2002, 16 (01): : 1 - 3
  • [30] Large vocabulary speech recognition in French
    Adda-Decker, Martine
    Adda, Gilles
    Gauvain, Jean-Luc
    Lamel, Lori
    ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings, 1999, 1 : 45 - 48