Impact Analysis of the Use of Speech and Language Models Pretrained by Self-Supervision for Spoken Language Understanding

Citations: 0
|
Authors
Mdhaffar, Salima [1 ]
Pelloin, Valentin [2 ]
Caubriere, Antoine [1 ]
Laperriere, Gaelle
Ghannay, Sahar [3 ]
Jabaian, Bassam [1 ]
Camelin, Nathalie [2 ]
Esteve, Yannick [1 ]
Affiliations
[1] Avignon Univ, LIA, Avignon, France
[2] Le Mans Univ, LIUM, Le Mans, France
[3] Univ Paris Saclay, CNRS, LISN, Paris, France
Funding
EU Horizon 2020;
Keywords
Spoken Language Understanding; Slot Filling; Error Analysis; Self-supervised models;
DOI
Not available
Chinese Library Classification
TP39 [Computer applications];
Discipline Classification Codes
081203 ; 0835 ;
Abstract
Pretrained models obtained through self-supervised learning have recently been introduced for both acoustic and language modeling. Applied to spoken language understanding tasks, these models have shown great potential by improving state-of-the-art performance on challenging benchmark datasets. In this paper, we present an error analysis of such models on the French MEDIA benchmark dataset, known as one of the most challenging slot-filling benchmarks available to the research community. One year ago, the state-of-the-art system reached a Concept Error Rate (CER) of 13.6% with an end-to-end neural architecture. A few months later, a cascade approach based on the sequential use of a fine-tuned wav2vec 2.0 model and a fine-tuned BERT model reached a CER of 11.2%. This significant improvement raises questions about the types of errors that remain difficult to handle, but also about those that have been corrected by these models pretrained through self-supervised learning on large amounts of data. This study brings some answers in order to better understand the limits of such models and opens new perspectives for continuing to improve performance.
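The Concept Error Rate cited in the abstract is, like Word Error Rate, an edit-distance-based metric computed over sequences of semantic concept labels rather than words. Below is a minimal sketch, assuming a standard Levenshtein alignment and micro-averaging over the whole test set; the concept labels in the example are illustrative placeholders in the style of MEDIA concepts, not taken from the paper.

```python
# Minimal sketch (not the paper's scoring tool) of a Concept Error Rate:
# CER = (substitutions + deletions + insertions) / number of reference concepts.

def levenshtein(ref, hyp):
    """Minimum number of substitutions, deletions and insertions needed
    to turn the reference sequence into the hypothesis sequence."""
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i                               # deletions
    for j in range(len(hyp) + 1):
        d[0][j] = j                               # insertions
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,        # deletion
                          d[i][j - 1] + 1,        # insertion
                          d[i - 1][j - 1] + cost) # substitution / match
    return d[len(ref)][len(hyp)]

def concept_error_rate(references, hypotheses):
    """Micro-averaged CER over a list of utterances; each utterance is a
    list of concept labels."""
    errors = sum(levenshtein(r, h) for r, h in zip(references, hypotheses))
    total = sum(len(r) for r in references)
    return errors / total if total else 0.0

# Hypothetical example: one utterance with three reference concepts,
# where the system misses one concept (one deletion -> CER = 1/3).
ref = [["commande-tache", "nombre-chambre", "localisation-ville"]]
hyp = [["commande-tache", "localisation-ville"]]
print(f"CER = {concept_error_rate(ref, hyp):.2%}")  # -> CER = 33.33%
```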
Pages: 2949-2956
Page count: 8