Knowledge-Augmented Reasoning Distillation for Small Language Models in Knowledge-Intensive Tasks

Cited by: 0
Authors
Kang, Minki [1, 2, 5]
Lee, Seanie [2 ]
Baek, Jinheon [2 ]
Kawaguchi, Kenji [3 ]
Hwang, Sung Ju [2, 4]
Affiliations
[1] KRAFTON, Seongnam, South Korea
[2] Korea Advanced Institute of Science and Technology (KAIST), Daejeon, South Korea
[3] National University of Singapore, Singapore, Singapore
[4] DeepAuto Ai, Seoul, South Korea
[5] AITRICS, Seoul, South Korea
Funding
National Research Foundation, Singapore
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Large Language Models (LLMs) have shown promising performance in knowledge-intensive reasoning tasks that require a compound understanding of knowledge. However, deploying LLMs in real-world applications can be challenging due to their high computational requirements and concerns about data privacy. Previous studies have focused on building task-specific small Language Models (LMs) by fine-tuning them with labeled data or distilling LLMs. However, these approaches are ill-suited for knowledge-intensive reasoning tasks due to the limited capacity of small LMs to memorize the required knowledge. Motivated by our theoretical analysis of memorization, we propose Knowledge-Augmented Reasoning Distillation (KARD), a novel method that fine-tunes small LMs to generate rationales obtained from LLMs with augmented knowledge retrieved from an external knowledge base. We further propose a neural reranker to obtain documents relevant to rationale generation. We empirically show that KARD significantly improves the performance of small T5 and GPT models on the challenging knowledge-intensive reasoning datasets MedQA-USMLE, StrategyQA, and OpenbookQA. Notably, our method enables 250M T5 models to outperform fine-tuned 3B models, which have 12 times as many parameters, on both the MedQA-USMLE and StrategyQA benchmarks.
Pages: 30