Improving Speech Understanding Accuracy with Limited Training Data Using Multiple Language Models and Multiple Understanding Models

Cited: 0
Authors
Katsumaru, Masaki [1 ]
Nakano, Mikio [2 ]
Komatani, Kazunori [1 ]
Funakoshi, Kotaro [2 ]
Ogata, Tetsuya [1 ]
Okuno, Hiroshi G. [1 ]
Affiliations
[1] Kyoto Univ, Grad Sch Informat, Kyoto, Japan
[2] Honda Res Inst Japan Co Ltd, Kisarazu, Chiba, Japan
Keywords
speech understanding; multiple language models and language understanding models; limited training data;
DOI
Not available
Chinese Library Classification
TP18 [Artificial Intelligence Theory]
Discipline codes
081104; 0812; 0835; 1405
Abstract
We aim to improve a speech understanding module with a small amount of training data. A speech understanding module uses a language model (LM) and a language understanding model (LUM). A large amount of training data is needed to improve these models, but collecting such data is difficult in an actual development process. We therefore design and develop a new framework that uses multiple LMs and LUMs to improve speech understanding accuracy under various amounts of training data. Even when the available training data are limited, each LM and each LUM copes well with different types of utterances, so more utterances are understood by using multiple LMs and LUMs. As one implementation of the framework, we develop a method for selecting the most appropriate speech understanding result from several candidates. The selection is based on probabilities of correctness calculated by logistic regression. We evaluate our framework with various amounts of training data.
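The selection step described in the abstract can be sketched as follows: each (LM, LUM) pair produces one understanding result, a logistic regression model estimates each result's probability of correctness, and the highest-scoring result is selected. This is a minimal illustrative sketch; the feature names and values below are assumptions for demonstration, not the paper's actual feature set.

```python
# Sketch: selecting the best speech understanding result among candidates
# produced by multiple (LM, LUM) pairs, scored by logistic regression.
import numpy as np
from sklearn.linear_model import LogisticRegression

# Hypothetical training data: one feature vector per understanding result,
# e.g. [ASR confidence, LM score, LUM score]; label 1 = result was correct.
X_train = np.array([
    [0.9, 0.8, 0.7],
    [0.2, 0.3, 0.4],
    [0.8, 0.9, 0.6],
    [0.1, 0.2, 0.5],
])
y_train = np.array([1, 0, 1, 0])

clf = LogisticRegression().fit(X_train, y_train)

def select_best(candidates):
    """Return the candidate whose estimated probability of correctness is highest."""
    feats = np.array([c["features"] for c in candidates])
    probs = clf.predict_proba(feats)[:, 1]  # P(correct) for each candidate
    return candidates[int(np.argmax(probs))]

# One candidate result per (LM, LUM) pair for a single input utterance.
candidates = [
    {"result": "from LM1/LUM1", "features": [0.85, 0.9, 0.8]},
    {"result": "from LM2/LUM2", "features": [0.3, 0.2, 0.4]},
]
best = select_best(candidates)
```

In this sketch the classifier only ranks candidates, so its absolute probabilities need not be calibrated; only their relative order matters for selection.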
Pages: 2699+
Number of pages: 2