On the N-gram Approximation of Pre-trained Language Models

Cited by: 1
Authors
Krishnan, Aravind [1 ,2 ]
Alabi, Jesujoba O. [1 ,3 ]
Klakow, Dietrich [1 ]
Affiliations
[1] Saarland Univ, Spoken Language Syst Grp, Saarbrucken, Germany
[2] German Res Ctr Artificial Intelligence DFKI, Kaiserslautern, Germany
[3] Saarland Informat Campus, Saarbrucken, Germany
Keywords
domain adaptation; approximation; GPT-2;
DOI
10.21437/Interspeech.2023-2182
CLC number
O42 [Acoustics];
Discipline codes
070206; 082403;
Abstract
Large pre-trained language models (PLMs) have shown remarkable performance across various natural language understanding (NLU) tasks, particularly in low-resource settings. Nevertheless, their potential in Automatic Speech Recognition (ASR) remains largely unexplored. This study investigates the use of PLMs for language modelling in ASR. We compare two methods for approximating GPT-2 with an n-gram model: large-scale text sampling and probability conversion. Furthermore, we introduce a vocabulary-restricted decoding method for random sampling, and we evaluate the effects of domain difficulty and data size on the usability of the generated text. Our findings across eight domain-specific corpora support the use of sampling-based approximation and show that interpolating with a large sampled corpus improves test perplexity over a baseline trigram by 15%. Our vocabulary-restricted decoding method yields a further 5% improvement in domain-specific settings.
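
To make the vocabulary-restricted sampling idea concrete, the following is a minimal illustrative sketch, not the authors' implementation: the word list domain_vocab, the prompt, and the sampling length are hypothetical. It masks GPT-2's next-token distribution so that only tokens from a small domain vocabulary can be sampled.

# Minimal sketch of vocabulary-restricted random sampling from GPT-2
# (illustrative only; `domain_vocab`, the prompt, and the sampling length
# are hypothetical and not taken from the paper).
import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
model.eval()

# Hypothetical domain word list; in practice this would be built from the
# target-domain corpus vocabulary.
domain_vocab = {"the", "flight", "to", "boston", "leaves", "at", "noon", "."}

# Build an additive mask that sends every BPE token outside the domain
# vocabulary to -inf, so it can never be sampled.
allowed_ids = [i for i in range(len(tokenizer))
               if tokenizer.decode([i]).strip().lower() in domain_vocab]
mask = torch.full((len(tokenizer),), float("-inf"))
mask[allowed_ids] = 0.0

generated = tokenizer.encode("the flight", return_tensors="pt")
with torch.no_grad():
    for _ in range(20):
        logits = model(generated).logits[:, -1, :] + mask   # restrict vocabulary
        probs = torch.softmax(logits, dim=-1)
        next_id = torch.multinomial(probs, num_samples=1)   # random sampling
        generated = torch.cat([generated, next_id], dim=-1)

print(tokenizer.decode(generated[0]))

Text sampled this way could then be pooled into a corpus, used to train a standard back-off trigram model (e.g., with SRILM or KenLM), and interpolated with an in-domain trigram, as the abstract describes.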
Pages: 371 - 375
Number of pages: 5
Related papers
50 records in total
  • [31] A Survey of Knowledge Enhanced Pre-Trained Language Models
    Hu, Linmei
    Liu, Zeyi
    Zhao, Ziwang
    Hou, Lei
    Nie, Liqiang
    Li, Juanzi
    IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2024, 36 (04) : 1413 - 1430
  • [32] Exploring Robust Overfitting for Pre-trained Language Models
    Zhu, Bin
    Rao, Yanghui
    FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, ACL 2023, 2023, : 5506 - 5522
  • [33] Self-conditioning Pre-Trained Language Models
    Suau, Xavier
    Zappella, Luca
    Apostoloff, Nicholas
    INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 162, 2022,
  • [34] Commonsense Knowledge Transfer for Pre-trained Language Models
    Zhou, Wangchunshu
    Le Bras, Ronan
    Choi, Yejin
    FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, ACL 2023, 2023, : 5946 - 5960
  • [35] Pre-trained models for natural language processing: A survey
    Qiu, XiPeng
    Sun, TianXiang
    Xu, YiGe
    Shao, YunFan
    Dai, Ning
    Huang, XuanJing
    SCIENCE CHINA (TECHNOLOGICAL SCIENCES), 2020, 63 (10) : 1872 - 1897
  • [36] Evaluating the Summarization Comprehension of Pre-Trained Language Models
    Chernyshev, D. I.
    Dobrov, B. V.
    LOBACHEVSKII JOURNAL OF MATHEMATICS, 2023, 44 (08) : 3028 - 3039
  • [37] Pre-trained language models: What do they know?
    Guimaraes, Nuno
    Campos, Ricardo
    Jorge, Alipio
    WILEY INTERDISCIPLINARY REVIEWS-DATA MINING AND KNOWLEDGE DISCOVERY, 2024, 14 (01)
  • [38] Capturing Semantics for Imputation with Pre-trained Language Models
    Mei, Yinan
    Song, Shaoxu
    Fang, Chenguang
    Yang, Haifeng
    Fang, Jingyun
    Long, Jiang
    2021 IEEE 37TH INTERNATIONAL CONFERENCE ON DATA ENGINEERING (ICDE 2021), 2021, : 61 - 72
  • [39] Empowering News Recommendation with Pre-trained Language Models
    Wu, Chuhan
    Wu, Fangzhao
    Qi, Tao
    Huang, Yongfeng
    SIGIR '21 - PROCEEDINGS OF THE 44TH INTERNATIONAL ACM SIGIR CONFERENCE ON RESEARCH AND DEVELOPMENT IN INFORMATION RETRIEVAL, 2021, : 1652 - 1656
  • [40] Understanding Online Attitudes with Pre-Trained Language Models
    Power, William
    Obradovic, Zoran
    PROCEEDINGS OF THE 2023 IEEE/ACM INTERNATIONAL CONFERENCE ON ADVANCES IN SOCIAL NETWORKS ANALYSIS AND MINING, ASONAM 2023, 2023, : 745 - 752