Knowledge-Augmented Reasoning Distillation for Small Language Models in Knowledge-Intensive Tasks

Cited by: 0
Authors
Kang, Minki [1, 2, 5]
Lee, Seanie [2]
Baek, Jinheon [2]
Kawaguchi, Kenji [3]
Hwang, Sung Ju [2, 4]
Affiliations
[1] KRAFTON, Seongnam, South Korea
[2] Korea Adv Inst Sci & Technol, Daejeon, South Korea
[3] Natl Univ Singapore, Singapore, Singapore
[4] DeepAuto Ai, Seoul, South Korea
[5] AITRICS, Seoul, South Korea
Source
ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 36 (NEURIPS 2023) | 2023
Funding
National Research Foundation, Singapore;
Keywords
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Large Language Models (LLMs) have shown promising performance in knowledge-intensive reasoning tasks that require a compound understanding of knowledge. However, deploying LLMs in real-world applications can be challenging due to their high computational requirements and concerns about data privacy. Previous studies have focused on building task-specific small Language Models (LMs) by fine-tuning them with labeled data or distilling LLMs. However, these approaches are ill-suited for knowledge-intensive reasoning tasks because small LMs have limited capacity to memorize the required knowledge. Motivated by our theoretical analysis of memorization, we propose Knowledge-Augmented Reasoning Distillation (KARD), a novel method that fine-tunes small LMs to generate rationales obtained from LLMs, augmented with knowledge retrieved from an external knowledge base. We further propose a neural reranker to obtain documents relevant to rationale generation. We empirically show that KARD significantly improves the performance of small T5 and GPT models on the challenging knowledge-intensive reasoning datasets MedQA-USMLE, StrategyQA, and OpenbookQA. Notably, our method enables 250M T5 models to outperform fine-tuned 3B models, which have 12 times as many parameters, on both the MedQA-USMLE and StrategyQA benchmarks.
Pages: 30
Related papers
50 in total
  • [41] Knowledge management issues in knowledge-intensive SMEs
    Nunes, MB
    Annansingh, F
    Eaglestone, B
    Wakefield, R
    JOURNAL OF DOCUMENTATION, 2006, 62 (01) : 101 - 119
  • [42] Knowledge Protection in Knowledge-Intensive Business Services
    Bolisani, Ettore
    Paiola, Marco
    Scarso, Enrico
    2011 6TH INTERNATIONAL FORUM ON KNOWLEDGE ASSET DYNAMICS (IFKAD2011): KNOWLEDGE-BASED FOUNDATIONS OF THE SERVICE ECONOMY, 2011 : 1153 - 1169
  • [43] Knowledge and experience in the internationalization of knowledge-intensive firms
    Nummela, Niina
    Saarenketo, Sami
    Paavilainen-Mantymaki, Eriikka
    Puumalainen, Kaisu
    THEORY AND PRACTICE OF ENTREPRENEURSHIP: FRONTIERS IN EUROPEAN ENTREPRENEURSHIP RESEARCH, 2010 : 101 - 121
  • [44] Weakly-structured workflows for knowledge-intensive tasks: An experimental evaluation
    van Elst, L
    Aschoff, FR
    Bernardi, A
    Maus, H
    Schwarz, S
    TWELFTH IEEE INTERNATIONAL WORKSHOPS ON ENABLING TECHNOLOGIES: INFRASTRUCTURE FOR COLLABORATIVE ENTERPRISES, PROCEEDINGS, 2003 : 340 - 345
  • [45] ReAugKD: Retrieval-Augmented Knowledge Distillation For Pre-trained Language Models
    Zhang, Jianyi
    Muhamed, Aashiq
    Anantharaman, Aditya
    Wang, Guoyin
    Chen, Changyou
    Zhong, Kai
    Cui, Qingjun
    Xu, Yi
    Zeng, Belinda
    Chilimbi, Trishul
    Chen, Yiran
    61ST CONFERENCE OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, ACL 2023, VOL 2, 2023 : 1128 - 1136
  • [46] CAPS - A LANGUAGE FOR MODELING HIGHLY SKILLED KNOWLEDGE-INTENSIVE BEHAVIOR
    THIBADEAU, R
    BEHAVIOR RESEARCH METHODS & INSTRUMENTATION, 1983, 15 (02) : 300 - 304
  • [47] A knowledge-augmented neural network model for sarcasm detection
    Ren, Yafeng
    Wang, Zilin
    Peng, Qiong
    Ji, Donghong
    INFORMATION PROCESSING & MANAGEMENT, 2023, 60 (06)
  • [48] LEARNING BY KNOWLEDGE-INTENSIVE FIRMS
    STARBUCK, WH
    JOURNAL OF MANAGEMENT STUDIES, 1992, 29 (06) : 713 - 740
  • [49] Management of knowledge-intensive companies
    den Hertog, F
    ORGANIZATION STUDIES, 1998, 19 (06) : 1053 - 1058
  • [50] Knowledge-Intensive Innovative Entrepreneurship
    Malerba, Franco
    McKelvey, Maureen
    FOUNDATIONS AND TRENDS IN ENTREPRENEURSHIP, 2018, 14 (06) : 555 - 681