KALA: Knowledge-Augmented Language Model Adaptation

Cited by: 0
Authors
Kang, Minki [1 ,2 ]
Baek, Jinheon [1 ]
Hwang, Sung Ju [1 ,2 ]
Affiliations
[1] Korea Adv Inst Sci & Technol, Daejeon, South Korea
[2] AITRICS, Seoul, South Korea
Source
NAACL 2022: THE 2022 CONFERENCE OF THE NORTH AMERICAN CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS: HUMAN LANGUAGE TECHNOLOGIES | 2022
Funding
National Research Foundation of Singapore;
Keywords
DOI
Not available
CLC Number
TP18 [Artificial Intelligence Theory];
Discipline Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Pre-trained language models (PLMs) have achieved remarkable success on various natural language understanding tasks. However, simple fine-tuning of PLMs can be suboptimal for domain-specific tasks, since PLMs cannot possibly cover knowledge from all domains. While adaptive pre-training of PLMs can help them obtain domain-specific knowledge, it incurs a large training cost. Moreover, adaptive pre-training can harm the PLM's performance on the downstream task by causing catastrophic forgetting of its general knowledge. To overcome such limitations of adaptive pre-training for PLM adaptation, we propose a novel domain adaptation framework for PLMs coined Knowledge-Augmented Language model Adaptation (KALA), which modulates the intermediate hidden representations of PLMs with domain knowledge, consisting of entities and their relational facts. We validate the performance of KALA on question answering and named entity recognition tasks on multiple datasets across various domains. The results show that, despite being computationally efficient, KALA largely outperforms adaptive pre-training. Code is available at: https://github.com/Nardien/KALA.
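The abstract describes KALA as modulating a PLM's intermediate hidden representations with entity-level domain knowledge. The snippet below is a minimal, illustrative sketch of that idea as a FiLM-style scale-and-shift conditioned on entity embeddings; it is not the official implementation (see the linked repository), and names such as `KnowledgeModulation`, `hidden_dim`, and `entity_dim` are illustrative assumptions.

```python
# Minimal sketch (an assumption, not the paper's official code) of
# knowledge-conditioned feature modulation: a PLM hidden state is scaled
# and shifted using an embedding of the entity linked to each token.
import torch
import torch.nn as nn


class KnowledgeModulation(nn.Module):
    """Illustrative scale-and-shift of PLM hidden states by entity embeddings."""

    def __init__(self, hidden_dim: int, entity_dim: int):
        super().__init__()
        self.gamma = nn.Linear(entity_dim, hidden_dim)  # knowledge-conditioned scale
        self.beta = nn.Linear(entity_dim, hidden_dim)   # knowledge-conditioned shift

    def forward(self, hidden: torch.Tensor, entity: torch.Tensor) -> torch.Tensor:
        # hidden: (batch, seq_len, hidden_dim) intermediate PLM representations
        # entity: (batch, seq_len, entity_dim) embeddings of entities aligned to
        #         tokens (e.g., zeros where no entity is linked)
        return (1 + self.gamma(entity)) * hidden + self.beta(entity)


# Toy usage with random tensors standing in for PLM states and entity embeddings.
if __name__ == "__main__":
    layer = KnowledgeModulation(hidden_dim=768, entity_dim=100)
    h = torch.randn(2, 16, 768)
    e = torch.randn(2, 16, 100)
    print(layer(h, e).shape)  # torch.Size([2, 16, 768])
```

In the paper's framework the entity representations come from the domain knowledge graph (entities and their relational facts); in this sketch they are random placeholders.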
Pages: 5144-5167
Number of pages: 24
Related papers
50 items in total
  • [41] Language Model Adaptation for Tiny Adaptation Corpora
    Klakow, Dietrich
    INTERSPEECH 2006 AND 9TH INTERNATIONAL CONFERENCE ON SPOKEN LANGUAGE PROCESSING, VOLS 1-5, 2006, : 2214 - 2217
  • [42] Cross Language Information Extraction Knowledge Adaptation
    Wong, Tak-Lam
    Chow, Kai-On
    Lam, Wai
    ROUGH SETS AND KNOWLEDGE TECHNOLOGY, PROCEEDINGS, 2009, 5589 : 520 - +
  • [43] Unsupervised language model adaptation
    Bacchiani, M
    Roark, B
    2003 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOL I, PROCEEDINGS: SPEECH PROCESSING I, 2003, : 224 - 227
  • [44] MasonNLP+ at SemEval-2023 Task 8: Extracting Medical Questions, Experiences and Claims from Social Media using Knowledge-Augmented Pre-trained Language Models
    Ramachandran, Giridhar Kaushik
    Gangavarapu, Haritha
    Lybarger, Kevin
    Uzuner, Ozlem
    17TH INTERNATIONAL WORKSHOP ON SEMANTIC EVALUATION, SEMEVAL-2023, 2023, : 2143 - 2152
  • [45] KNOWLEDGE AND THE ARTS + JNANA AND KALA
    MARAR, K
    JOURNAL OF SOUTH ASIAN LITERATURE, 1980, 15 (02) : 269 - 272
  • [46] Application of retrieval-augmented generation for interactive industrial knowledge management via a large language model
    Chen, Lun-Chi
    Pardeshi, Mayuresh Sunil
    Liao, Yi-Xiang
    Pai, Kai-Chih
    COMPUTER STANDARDS & INTERFACES, 2025, 94
  • [47] ChatENT: Augmented Large Language Model for Expert Knowledge Retrieval in Otolaryngology-Head and Neck Surgery
    Long, Cai
    Subburam, Deepak
    Lowe, Kayle
    dos Santos, Andre
    Zhang, Jessica
    Hwang, Sang
    Saduka, Neil
    Horev, Yoav
    Su, Tao
    Cote, David W. J.
    Wright, Erin D.
    OTOLARYNGOLOGY-HEAD AND NECK SURGERY, 2024, 171 (04) : 1042 - 1051
  • [48] Language model adaptation for language and dialect identification of text
    Jauhiainen, T.
    Linden, K.
    Jauhiainen, H.
    NATURAL LANGUAGE ENGINEERING, 2019, 25 (05) : 561 - 583
  • [49] Adaptation Augmented Model-based Policy Optimization
    Shen, Jian
    Lai, Hang
    Liu, Minghuan
    Zhao, Han
    Yu, Yong
    Zhang, Weinan
    JOURNAL OF MACHINE LEARNING RESEARCH, 2023, 24
  • [50] Visual Comparison of Language Model Adaptation
    Sevastjanova, R.
    Cakmak, E.
    Ravfogel, S.
    Cotterell, R.
    El-Assady, M.
    IEEE TRANSACTIONS ON VISUALIZATION AND COMPUTER GRAPHICS, 2023, 29 (01) : 1178 - 1188