End to End Spoken Language Understanding Using Partial Disentangled Slot Embedding

Cited by: 0

Authors:

Liu, Tan [1]

Guo, Wu [1]

Affiliation:

[1] Univ Sci & Technol China, Natl Engn Lab Speech & Language Informat Proc, Hefei, Peoples R China

Source:

2021 ASIA-PACIFIC SIGNAL AND INFORMATION PROCESSING ASSOCIATION ANNUAL SUMMIT AND CONFERENCE (APSIPA ASC) | 2021

Funding:

National Natural Science Foundation of China;

Keywords:

end to end; spoken language understanding; disentangled embedding;

DOI:

Not available

Chinese Library Classification:

TP [Automation technology, computer technology];

Discipline classification code:

0812 ;

Abstract:

Spoken language understanding (SLU) has recently shifted from pipeline approaches to end-to-end (E2E) ones. Most E2E approaches use neural networks to extract embeddings directly from the audio signal for final intent prediction. In this paper, we explore this method for intent classification on the Fluent Speech Commands (FSC) dataset, where each intent is a combination of three slots (action, object, and location). In the extracted embeddings, the information of different slots becomes entangled, which sometimes causes errors when predicting a given slot. To address this problem, we propose a partial disentangled slot embedding (PDSE) method based on adversarial training. Results show that the proposed method achieves an error rate of 0.53%, an error rate reduction of over 35.3% compared with the baseline.
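The abstract describes each intent as a combination of three slots, so one plausible reading is that an utterance counts as correct only when all three slots are predicted correctly. The sketch below illustrates that scoring rule and the error rate reduction arithmetic. It is not the authors' evaluation code: the function names, the slot values, and the toy predictions are hypothetical, and treating the reported reduction as a relative reduction is an assumption.

```python
from typing import List, Tuple

# Assumed representation: one intent = (action, object, location).
Slots = Tuple[str, str, str]


def intent_error_rate(predicted: List[Slots], reference: List[Slots]) -> float:
    """Fraction of utterances whose predicted slot triple differs from the reference."""
    if len(predicted) != len(reference):
        raise ValueError("predicted and reference must have the same length")
    if not reference:
        raise ValueError("need at least one utterance")
    errors = sum(1 for p, r in zip(predicted, reference) if p != r)
    return errors / len(reference)


def relative_error_reduction(baseline: float, proposed: float) -> float:
    """Relative reduction of the proposed error rate with respect to the baseline."""
    return (baseline - proposed) / baseline


# Toy data, invented for illustration only.
reference = [
    ("activate", "lights", "kitchen"),
    ("increase", "volume", "none"),
    ("deactivate", "lights", "bedroom"),
    ("increase", "heat", "washroom"),
]
predicted = [
    ("activate", "lights", "kitchen"),
    ("increase", "heat", "none"),  # one wrong slot makes the whole intent wrong
    ("deactivate", "lights", "bedroom"),
    ("increase", "heat", "washroom"),
]

rate = intent_error_rate(predicted, reference)
print(f"intent error rate: {rate:.2%}")
```

Under this scoring rule a single confused slot is enough to make the whole intent wrong, which is why entanglement between slot embeddings can show up directly in the intent error rate.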

Pages: 1062-1066

Page count: 5
