DRAK: Unlocking Molecular Insights with Domain-Specific Retrieval-Augmented Knowledge in LLMs

Citations: 0
Authors
Liu, Jinzhe [1, 2]
Huang, Xiangsheng [3 ]
Chen, Zhuo [4 ]
Fang, Yin [4 ]
Affiliations
[1] Chinese Acad Sci, Inst Comp Technol, Beijing, Peoples R China
[2] Univ Chinese Acad Sci, Beijing, Peoples R China
[3] Chinese Acad Sci, Xiongan Inst Innovat, Hebei Key Lab Cognit Intelligence, Baoding, Peoples R China
[4] Zhejiang Univ, Coll Comp Sci & Technol, Hangzhou, Peoples R China
Keywords
Retrieval-augmented knowledge; Knowledge injection; Biomolecular domain; Language
DOI
10.1007/978-981-97-9434-8_20
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Large Language Models (LLMs) typically exhibit knowledge gaps in specialized applications because they are pre-trained on general-purpose text corpora. Although fine-tuning and modality alignment aim to bridge this gap, they cannot provide comprehensive knowledge coverage, so LLMs still deliver imprecise responses. To address these challenges, we introduce a scalable and adaptable non-parametric knowledge injection framework, Domain-specific Retrieval-Augmented Knowledge (DRAK), which aims to strengthen LLMs' knowledge reasoning ability through context examples. DRAK combines retrieval enhancement with structured knowledge-graph recall of high-quality instances and uses the retrieved examples to unlock LLMs' context-relevant molecular learning capabilities, offering a general solution for specialized domains. We validate DRAK's effectiveness and generalizability in the biomolecular domain, where it achieves superior performance across twelve tasks involving both molecule-oriented and bioinformatics texts within the Mol-Instructions dataset. This demonstration of DRAK's ability to unearth molecular insights establishes a standardized approach for LLMs navigating knowledge-intensive challenges.
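The general idea of non-parametric knowledge injection through retrieved examples can be illustrated with a minimal sketch. It is not DRAK's actual implementation: the abstract does not specify the retriever, the knowledge-graph recall step, or the prompt format, so the token-overlap (Jaccard) scoring, the toy example store, and the names `EXAMPLES`, `retrieve`, and `build_prompt` below are all assumptions made for illustration.

```python
from typing import Dict, List, Set

# Hypothetical toy example store. These entries are illustrative
# placeholders and are NOT taken from the Mol-Instructions dataset.
EXAMPLES: List[Dict[str, str]] = [
    {"query": "Name the molecule with SMILES CCO", "answer": "ethanol"},
    {"query": "Name the molecule with SMILES c1ccccc1", "answer": "benzene"},
    {"query": "What does the abbreviation DNA stand for",
     "answer": "deoxyribonucleic acid"},
]


def tokens(text: str) -> Set[str]:
    """Lowercased whitespace tokens; a stand-in for a real encoder."""
    return set(text.lower().split())


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Token-set overlap in [0, 1]; assumed here in place of a learned retriever."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def retrieve(query: str, store: List[Dict[str, str]], k: int = 2) -> List[Dict[str, str]]:
    """Return the k stored examples most similar to the query."""
    q = tokens(query)
    ranked = sorted(store, key=lambda ex: jaccard(q, tokens(ex["query"])), reverse=True)
    return ranked[:k]


def build_prompt(query: str, store: List[Dict[str, str]], k: int = 2) -> str:
    """Prepend the retrieved examples to the query as in-context demonstrations."""
    shots = [f"Q: {ex['query']}\nA: {ex['answer']}" for ex in retrieve(query, store, k)]
    shots.append(f"Q: {query}\nA:")
    return "\n\n".join(shots)


if __name__ == "__main__":
    print(build_prompt("Name the molecule with SMILES CC(=O)O", EXAMPLES))
```

The model's weights are never updated in this setup; the knowledge arrives only through the prompt, which is what makes the approach non-parametric. A real system would presumably replace the overlap score with a domain-aware retriever and add the structured knowledge-graph recall that the abstract mentions.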
Pages: 255-267
Page count: 13