On the N-gram Approximation of Pre-trained Language Models

Cited by: 1
Authors
Krishnan, Aravind [1 ,2 ]
Alabi, Jesujoba O. [1 ,3 ]
Klakow, Dietrich [1 ]
Affiliations
[1] Saarland Univ, Spoken Language Syst Grp, Saarbrucken, Germany
[2] German Res Ctr Artificial Intelligence DFKI, Kaiserslautern, Germany
[3] Saarland Informat Campus, Saarbrucken, Germany
Source
Keywords
domain adaptation; approximation; GPT-2;
DOI
10.21437/Interspeech.2023-2182
CLC number
O42 [Acoustics];
Discipline codes
070206; 082403
Abstract
Large pre-trained language models (PLMs) have shown remarkable performance across various natural language understanding (NLU) tasks, particularly in low-resource settings. Nevertheless, their potential in Automatic Speech Recognition (ASR) remains largely unexplored. This study investigates the use of PLMs for language modelling in ASR. We compare the application of large-scale text sampling and probability conversion for approximating GPT-2 into an n-gram model. Furthermore, we introduce a vocabulary-restricted decoding method for random sampling, and evaluate the effects of domain difficulty and data size on the usability of generated text. Our findings across eight domain-specific corpora support the use of sampling-based approximation and show that interpolating with a large sampled corpus improves test perplexity over a baseline trigram by 15%. Our vocabulary-restricted decoding method pushes this improvement further by 5% in domain-specific settings.
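The sampling-based pipeline the abstract describes (draw text from the PLM, estimate an n-gram model from the samples, then linearly interpolate it with an in-domain n-gram model) can be sketched on toy data. The corpora, interpolation weights, and helper names below are illustrative assumptions only; the paper's actual setup samples from GPT-2 at scale and tunes the interpolation on held-out data.

```python
from collections import Counter

def trigram_counts(tokens):
    # Collect trigram counts and their bigram contexts for MLE estimates.
    tri = Counter(zip(tokens, tokens[1:], tokens[2:]))
    bi = Counter(zip(tokens, tokens[1:]))
    return tri, bi

def trigram_prob(model, w1, w2, w3):
    # MLE estimate of P(w3 | w1, w2); zero if the context was never seen.
    tri, bi = model
    context = bi[(w1, w2)]
    return tri[(w1, w2, w3)] / context if context else 0.0

def interpolated_prob(models, weights, w1, w2, w3):
    # Linear interpolation of several trigram models (weights sum to 1).
    return sum(lam * trigram_prob(m, w1, w2, w3)
               for lam, m in zip(weights, models))

# Toy in-domain corpus and a toy stand-in for a GPT-2-sampled corpus.
domain = "the cat sat on the mat the cat sat".split()
sampled = "the cat sat on a mat and the cat ran".split()

in_domain_lm = trigram_counts(domain)
sampled_lm = trigram_counts(sampled)

# Illustrative weights; in practice these would be tuned on held-out text.
p = interpolated_prob([in_domain_lm, sampled_lm], [0.85, 0.15],
                      "the", "cat", "sat")
print(p)  # 0.85 * 1.0 + 0.15 * 0.5 = 0.925
```

The sampled-corpus component contributes probability mass to n-grams the small in-domain corpus covers sparsely, which is the mechanism behind the reported perplexity reduction.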
Pages: 371-375
Page count: 5
Related papers
50 in total
  • [41] Memorisation versus Generalisation in Pre-trained Language Models
    Tanzer, Michael
    Ruder, Sebastian
    Rei, Marek
    PROCEEDINGS OF THE 60TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2022), VOL 1: (LONG PAPERS), 2022, : 7564 - 7578
  • [42] Context Analysis for Pre-trained Masked Language Models
    Lai, Yi-An
    Lalwani, Garima
    Zhang, Yi
    FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, EMNLP 2020, 2020, : 3789 - 3804
  • [43] Exploring Lottery Prompts for Pre-trained Language Models
    Chen, Yulin
    Ding, Ning
    Wang, Xiaobin
    Hu, Shengding
    Zheng, Hai-Tao
    Liu, Zhiyuan
    Xie, Pengjun
    PROCEEDINGS OF THE 61ST ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2023): LONG PAPERS, VOL 1, 2023, : 15428 - 15444
  • [44] On the Sentence Embeddings from Pre-trained Language Models
    Li, Bohan
    Zhou, Hao
    He, Junxian
    Wang, Mingxuan
    Yang, Yiming
    Li, Lei
    PROCEEDINGS OF THE 2020 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING (EMNLP), 2020, : 9119 - 9130
  • [45] Compressing Pre-trained Language Models by Matrix Decomposition
    Ben Noach, Matan
    Goldberg, Yoav
    1ST CONFERENCE OF THE ASIA-PACIFIC CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS AND THE 10TH INTERNATIONAL JOINT CONFERENCE ON NATURAL LANGUAGE PROCESSING (AACL-IJCNLP 2020), 2020, : 884 - 889
  • [46] Machine Unlearning of Pre-trained Large Language Models
    Yao, Jin
    Chien, Eli
    Du, Minxin
    Niu, Xinyao
    Wang, Tianhao
    Cheng, Zezhou
    Yue, Xiang
    PROCEEDINGS OF THE 62ND ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, VOL 1: LONG PAPERS, 2024, : 8403 - 8419
  • [47] Pre-trained language models for keyphrase prediction: A review
    Umair, Muhammad
    Sultana, Tangina
    Lee, Young-Koo
    ICT EXPRESS, 2024, 10 (04): : 871 - 890
  • [48] Pre-trained models for natural language processing: A survey
    QIU XiPeng
    SUN TianXiang
    XU YiGe
    SHAO YunFan
    DAI Ning
    HUANG XuanJing
    Science China (Technological Sciences), 2020, (10) : 1872 - 1897
  • [49] Robust Lottery Tickets for Pre-trained Language Models
    Zheng, Rui
    Bao, Rong
    Zhou, Yuhao
    Liang, Di
    Wang, Sirui
    Wu, Wei
    Gui, Tao
    Zhang, Qi
    Huang, Xuanjing
    PROCEEDINGS OF THE 60TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2022), VOL 1: (LONG PAPERS), 2022, : 2211 - 2224
  • [50] Pre-Trained Language Models for Text Generation: A Survey
    Li, Junyi
    Tang, Tianyi
    Zhao, Wayne Xin
    Nie, Jian-Yun
    Wen, Ji-Rong
    ACM COMPUTING SURVEYS, 2024, 56 (09)