KALA: Knowledge-Augmented Language Model Adaptation

Cited by: 0
Authors
Kang, Minki [1 ,2 ]
Baek, Jinheon [1 ]
Hwang, Sung Ju [1 ,2 ]
Affiliations
[1] Korea Adv Inst Sci & Technol, Daejeon, South Korea
[2] AITRICS, Seoul, South Korea
Source
NAACL 2022: THE 2022 CONFERENCE OF THE NORTH AMERICAN CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS: HUMAN LANGUAGE TECHNOLOGIES | 2022
Funding
National Research Foundation of Singapore;
Keywords
DOI
Not available
CLC Number
TP18 [Artificial Intelligence Theory];
Discipline Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Pre-trained language models (PLMs) have achieved remarkable success on various natural language understanding tasks. However, simple fine-tuning of PLMs can be suboptimal for domain-specific tasks, since PLMs cannot possibly cover knowledge from all domains. While adaptive pre-training of PLMs can help them obtain domain-specific knowledge, it incurs a large training cost. Moreover, adaptive pre-training can harm the PLM's performance on the downstream task by causing catastrophic forgetting of its general knowledge. To overcome such limitations of adaptive pre-training for PLM adaptation, we propose a novel domain adaptation framework for PLMs coined Knowledge-Augmented Language model Adaptation (KALA), which modulates the intermediate hidden representations of PLMs with domain knowledge, consisting of entities and their relational facts. We validate the performance of KALA on question answering and named entity recognition tasks on multiple datasets across various domains. The results show that, despite being computationally efficient, KALA largely outperforms adaptive pre-training. Code is available at: https://github.com/Nardien/KALA.
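The abstract describes KALA as modulating a PLM's intermediate hidden representations with entity-level domain knowledge. The snippet below is a minimal, illustrative sketch of that idea as a FiLM-style scale-and-shift conditioned on entity embeddings; it is not the official implementation (see the linked repository), and names such as `KnowledgeModulation`, `hidden_dim`, and `entity_dim` are illustrative assumptions.

```python
# Minimal sketch (an assumption, not the paper's official code) of
# knowledge-conditioned feature modulation: a PLM hidden state is scaled
# and shifted using an embedding of the entity linked to each token.
import torch
import torch.nn as nn


class KnowledgeModulation(nn.Module):
    """Illustrative scale-and-shift of PLM hidden states by entity embeddings."""

    def __init__(self, hidden_dim: int, entity_dim: int):
        super().__init__()
        self.gamma = nn.Linear(entity_dim, hidden_dim)  # knowledge-conditioned scale
        self.beta = nn.Linear(entity_dim, hidden_dim)   # knowledge-conditioned shift

    def forward(self, hidden: torch.Tensor, entity: torch.Tensor) -> torch.Tensor:
        # hidden: (batch, seq_len, hidden_dim) intermediate PLM representations
        # entity: (batch, seq_len, entity_dim) embeddings of entities aligned to
        #         tokens (e.g., zeros where no entity is linked)
        return (1 + self.gamma(entity)) * hidden + self.beta(entity)


# Toy usage with random tensors standing in for PLM states and entity embeddings.
if __name__ == "__main__":
    layer = KnowledgeModulation(hidden_dim=768, entity_dim=100)
    h = torch.randn(2, 16, 768)
    e = torch.randn(2, 16, 100)
    print(layer(h, e).shape)  # torch.Size([2, 16, 768])
```

In the paper's framework the entity representations come from the domain knowledge graph (entities and their relational facts); in this sketch they are random placeholders.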
Pages: 5144-5167
Number of pages: 24
Related papers
50 items in total
  • [41] Language Model Adaptation for Tiny Adaptation Corpora
    Klakow, Dietrich
    INTERSPEECH 2006 AND 9TH INTERNATIONAL CONFERENCE ON SPOKEN LANGUAGE PROCESSING, VOLS 1-5, 2006, : 2214 - 2217
  • [42] Cross Language Information Extraction Knowledge Adaptation
    Wong, Tak-Lam
    Chow, Kai-On
    Lam, Wai
    ROUGH SETS AND KNOWLEDGE TECHNOLOGY, PROCEEDINGS, 2009, 5589 : 520 - +
  • [43] Unsupervised language model adaptation
    Bacchiani, M
    Roark, B
    2003 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOL I, PROCEEDINGS: SPEECH PROCESSING I, 2003, : 224 - 227
  • [44] MasonNLP+ at SemEval-2023 Task 8: Extracting Medical Questions, Experiences and Claims from Social Media using Knowledge-Augmented Pre-trained Language Models
    Ramachandran, Giridhar Kaushik
    Gangavarapu, Haritha
    Lybarger, Kevin
    Uzuner, Ozlem
    17TH INTERNATIONAL WORKSHOP ON SEMANTIC EVALUATION, SEMEVAL-2023, 2023, : 2143 - 2152
  • [45] KNOWLEDGE AND THE ARTS + JNANA AND KALA
    MARAR, K
    JOURNAL OF SOUTH ASIAN LITERATURE, 1980, 15 (02) : 269 - 272
  • [46] Application of retrieval-augmented generation for interactive industrial knowledge management via a large language model
    Chen, Lun-Chi
    Pardeshi, Mayuresh Sunil
    Liao, Yi-Xiang
    Pai, Kai-Chih
    COMPUTER STANDARDS & INTERFACES, 2025, 94
  • [47] ChatENT: Augmented Large Language Model for Expert Knowledge Retrieval in Otolaryngology-Head and Neck Surgery
    Long, Cai
    Subburam, Deepak
    Lowe, Kayle
    dos Santos, Andre
    Zhang, Jessica
    Hwang, Sang
    Saduka, Neil
    Horev, Yoav
    Su, Tao
    Cote, David W. J.
    Wright, Erin D.
    OTOLARYNGOLOGY-HEAD AND NECK SURGERY, 2024, 171 (04) : 1042 - 1051
  • [48] Language model adaptation for language and dialect identification of text
    Jauhiainen, T.
    Linden, K.
    Jauhiainen, H.
    NATURAL LANGUAGE ENGINEERING, 2019, 25 (05) : 561 - 583
  • [49] Adaptation Augmented Model-based Policy Optimization
    Shen, Jian
    Lai, Hang
    Liu, Minghuan
    Zhao, Han
    Yu, Yong
    Zhang, Weinan
    JOURNAL OF MACHINE LEARNING RESEARCH, 2023, 24
  • [50] Visual Comparison of Language Model Adaptation
    Sevastjanova, R.
    Cakmak, E.
    Ravfogel, S.
    Cotterell, R.
    El-Assady, M.
    IEEE TRANSACTIONS ON VISUALIZATION AND COMPUTER GRAPHICS, 2023, 29 (01) : 1178 - 1188