End to End Spoken Language Understanding Using Partial Disentangled Slot Embedding

Cited by: 0
Authors
Liu, Tan [1]
Guo, Wu [1]
Affiliations
[1] Univ Sci & Technol China, Natl Engn Lab Speech & Language Informat Proc, Hefei, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
end to end; spoken language understanding; disentangled embedding;
DOI
Not available
CLC Number
TP [Automation Technology, Computer Technology];
Discipline Code
0812;
Abstract
Spoken language understanding (SLU) has recently shifted from pipeline approaches to end-to-end (E2E) ones. In most E2E approaches, neural networks extract embeddings directly from the audio signal for final intent prediction. In this paper, we explore this method for intent classification on the Fluent Speech Commands (FSC) dataset, where each intent is a combination of three slots (action, object, and location). The information of the different slots becomes entangled in the extracted embeddings, which sometimes causes errors in the prediction of the current slot. To address this problem, we propose a partial disentangled slot embedding (PDSE) method based on adversarial training. Results show that the proposed method achieves an error rate of 0.53%, outperforming the baseline with a relative error rate reduction of over 35.3%.
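To illustrate the disentanglement idea sketched in the abstract, the PyTorch snippet below shows one common way to adversarially discourage slot embeddings from encoding each other's information: a shared encoder produces one embedding per slot, each embedding feeds its own slot classifier, and a gradient reversal layer routes it into adversarial classifiers for the other slots. All class names, dimensions, and per-slot class counts here are hypothetical; this is a minimal sketch of gradient-reversal-based disentanglement, not the paper's actual PDSE implementation.

```python
# Minimal sketch (hypothetical names and shapes) of adversarial slot-embedding
# disentanglement with a gradient reversal layer (GRL). Not the paper's PDSE code.
import torch
import torch.nn as nn


class GradReverse(torch.autograd.Function):
    """Identity in the forward pass; negates (and scales) gradients in the backward pass."""

    @staticmethod
    def forward(ctx, x, lambd):
        ctx.lambd = lambd
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        return -ctx.lambd * grad_output, None


def grad_reverse(x, lambd=1.0):
    return GradReverse.apply(x, lambd)


class SlotDisentangleSLU(nn.Module):
    """Shared audio encoder -> one embedding per slot -> main + adversarial classifiers."""

    def __init__(self, feat_dim=80, hid=256, num_classes=(6, 14, 4)):
        # num_classes: illustrative per-slot class counts for (action, object, location)
        super().__init__()
        self.encoder = nn.GRU(feat_dim, hid, num_layers=2, batch_first=True)
        self.slot_proj = nn.ModuleList([nn.Linear(hid, hid) for _ in num_classes])
        # Main head i predicts slot i from embedding i.
        self.main_head = nn.ModuleList([nn.Linear(hid, c) for c in num_classes])
        # Adversarial head (i, j) tries to predict slot j from embedding i (j != i);
        # the GRL pushes the encoder to remove that information instead.
        self.adv_head = nn.ModuleList([
            nn.ModuleList([nn.Linear(hid, cj) for cj in num_classes]) for _ in num_classes
        ])

    def forward(self, feats, lambd=0.1):
        h, _ = self.encoder(feats)          # (B, T, hid)
        pooled = h.mean(dim=1)              # simple mean pooling over time
        slot_emb = [proj(pooled) for proj in self.slot_proj]
        main_logits = [head(e) for head, e in zip(self.main_head, slot_emb)]
        adv_logits = [
            [self.adv_head[i][j](grad_reverse(slot_emb[i], lambd))
             for j in range(len(slot_emb)) if j != i]
            for i in range(len(slot_emb))
        ]
        return main_logits, adv_logits


if __name__ == "__main__":
    model = SlotDisentangleSLU()
    feats = torch.randn(8, 200, 80)         # 8 utterances, 200 frames, 80-dim features
    main_logits, adv_logits = model(feats)
    print([l.shape for l in main_logits])   # one logit tensor per slot
```

In such a setup, the total loss would typically sum the per-slot cross-entropy terms and the adversarial cross-entropy terms applied through the reversed gradients; the paper's actual architecture, losses, and the details of its "partial" disentanglement should be taken from the full text.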
Pages: 1062-1066
Number of pages: 5