KALA: Knowledge-Augmented Language Model Adaptation

Cited by: 0
Authors
Kang, Minki [1, 2]
Baek, Jinheon [1]
Hwang, Sung Ju [1, 2]
Affiliations
[1] Korea Advanced Institute of Science and Technology (KAIST), Daejeon, South Korea
[2] AITRICS, Seoul, South Korea
Source
NAACL 2022: THE 2022 CONFERENCE OF THE NORTH AMERICAN CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS: HUMAN LANGUAGE TECHNOLOGIES | 2022
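Funding
National Research Foundation of Singapore;
Keywords
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104; 0812; 0835; 1405;
Abstract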
Pre-trained language models (PLMs) have achieved remarkable success on various natural language understanding tasks. However, simple fine-tuning of PLMs may be suboptimal for domain-specific tasks, because PLMs cannot cover knowledge from all domains. While adaptive pre-training of PLMs can help them obtain domain-specific knowledge, it incurs a large training cost. Moreover, adaptive pre-training can harm the PLM's performance on the downstream task by causing catastrophic forgetting of its general knowledge. To overcome these limitations of adaptive pre-training for PLM adaptation, we propose a novel domain adaptation framework for PLMs called Knowledge-Augmented Language model Adaptation (KALA), which modulates the intermediate hidden representations of PLMs with domain knowledge consisting of entities and their relational facts. We validate KALA on question answering and named entity recognition tasks on multiple datasets across various domains. The results show that, despite being computationally efficient, KALA largely outperforms adaptive pre-training. Code is available at: https://github.com/Nardien/KALA.
Pages: 5144-5167
Page count: 24
Related papers
50 in total
  • [21] Knowledge-augmented Graph Machine Learning for Drug Discovery: From Precision to Interpretability
    Zhong, Zhiqiang
    Mottin, Davide
    PROCEEDINGS OF THE 29TH ACM SIGKDD CONFERENCE ON KNOWLEDGE DISCOVERY AND DATA MINING, KDD 2023, 2023: 5841-5842
  • [22] Contrastive knowledge-augmented self-distillation approach for few-shot learning
    Zhang, Lixu
    Shao, Mingwen
    Chen, Sijie
    Liu, Fukang
    JOURNAL OF ELECTRONIC IMAGING, 2023, 32 (05)
  • [23] Knowledge-Augmented Interpretable Network for Zero-Shot Stance Detection on Social Media
    Zhang, Bowen
    Ding, Daijun
    Huang, Zhichao
    Li, Ang
    Li, Yangyang
    Zhang, Baoquan
    Huang, Hu
    IEEE TRANSACTIONS ON COMPUTATIONAL SOCIAL SYSTEMS, 2024: 1-12
  • [24] Prior Knowledge-Augmented Meta-Learning for Fine-Grained Fault Diagnosis
    Zhou, Yuhang
    Zhang, Qiang
    Huang, Ting
    Cai, Zhengyang
    IEEE TRANSACTIONS ON INDUSTRIAL INFORMATICS, 2024, 20 (06): 8115-8124
  • [25] View-Based Knowledge-Augmented Multimodal Semantic Understanding for Optical Remote Sensing Images
    Zhu, Lilu
    Su, Xiaolu
    Tang, Jiaxuan
    Hu, Yanfeng
    Wang, Yang
    IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING, 2025, 63
  • [26] EIKA: Explicit & Implicit Knowledge-Augmented Network for entity-aware sports video captioning
    Xi, Zeyu
    Shi, Ge
    Sun, Haoying
    Zhang, Bowen
    Li, Shuyi
    Wu, Lifang
    EXPERT SYSTEMS WITH APPLICATIONS, 2025, 274
  • [27] A knowledge-augmented heterogeneous graph convolutional network for aspect-level multimodal sentiment analysis
    Yujie, Wan
    Yuzhong, Chen
    Jiali, Lin
    Jiayuan, Zhong
    Chen, Dong
    COMPUTER SPEECH AND LANGUAGE, 2024, 85
  • [28] KaTaGCN: Knowledge-Augmented and Time-Aware Graph Convolutional Network for efficient traffic forecasting
    Wang, Yuyan
    Hu, Jie
    Teng, Fei
    Peng, Lilan
    Du, Shengdong
    Li, Tianrui
    INFORMATION FUSION, 2024, 111
  • [29] Analyzing Surveillance Videos using automatically generated processing sequences with Knowledge-Augmented Genetic Algorithms
    Samarabandu, Jagath
    Ranaweera, Kamal
    2016 IEEE CANADIAN CONFERENCE ON ELECTRICAL AND COMPUTER ENGINEERING (CCECE), 2016
  • [30] Semantic Similarity Measurement Using Knowledge-Augmented Multiple-prototype Distributed Word Vector
    Lu, Wei
    Shi, Kailun
    Cai, Yuanyuan
    Che, Xiaoping
    INTERNATIONAL JOURNAL OF INTERDISCIPLINARY TELECOMMUNICATIONS AND NETWORKING, 2016, 8 (02): 45-57