DRAK: Unlocking Molecular Insights with Domain-Specific Retrieval-Augmented Knowledge in LLMs

Cited by: 0

Authors:

Liu, Jinzhe ^{[1,2]}

Huang, Xiangsheng ^{[3]}

Chen, Zhuo ^{[4]}

Fang, Yin ^{[4]}

Affiliations:

[1] Chinese Acad Sci, Inst Comp Technol, Beijing, Peoples R China

[2] Univ Chinese Acad Sci, Beijing, Peoples R China

[3] Chinese Acad Sci, Xiongan Inst Innovat, Hebei Key Lab Cognit Intelligence, Baoding, Peoples R China

[4] Zhejiang Univ, Coll Comp Sci & Technol, Hangzhou, Peoples R China

Source:

NATURAL LANGUAGE PROCESSING AND CHINESE COMPUTING, PT II, NLPCC 2024 | 2025 / Vol. 15360

Keywords:

Retrieval-augmented knowledge; Knowledge injection; Biomolecular domain; LANGUAGE;

DOI:

10.1007/978-981-97-9434-8_20

Chinese Library Classification number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Large Language Models (LLMs) typically manifest a knowledge gap in specialized applications because they are pre-trained on generalized textual corpora. Although fine-tuning and modality alignment aim to bridge this gap, their inability to provide comprehensive knowledge coverage leads LLMs to deliver imprecise responses. To address these challenges, we introduce a scalable and adaptable non-parametric knowledge injection framework, Domain-specific Retrieval-Augmented Knowledge (DRAK), aimed at bolstering LLMs' knowledge reasoning ability through context examples. DRAK integrates retrieval enhancement and structured knowledge graph recall of high-quality instances, utilizing retrieved examples to unlock LLMs' context-relevant molecular learning capabilities, offering a universal solution for specific domains. We validate DRAK's effectiveness and generalizability in the biomolecular domain, achieving superior performance across twelve tasks involving both molecule-oriented and bioinformatics texts within the Mol-Instructions dataset. This demonstration of DRAK's ability to unearth molecular insights establishes a standardized approach for LLMs in navigating the complexities of knowledge-intensive challenges.
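
The abstract describes injecting knowledge non-parametrically by retrieving examples and placing them in the model's context. The sketch below illustrates only that general idea (retrieval-based selection of in-context examples) and is not the authors' implementation: the example store `EXAMPLES`, the bag-of-words cosine similarity, and the functions `retrieve` and `build_prompt` are all hypothetical choices made for illustration, and the knowledge graph recall component is not modeled.

```python
import math
from collections import Counter

# Hypothetical toy store of (instruction, answer) pairs; a real system
# would draw these from a curated domain corpus.
EXAMPLES = [
    ("Describe the molecule CCO", "Ethanol, a simple alcohol."),
    ("Describe the molecule CC(=O)O", "Acetic acid, a carboxylic acid."),
    ("What is the function of the protein hemoglobin",
     "It transports oxygen in blood."),
]


def bag_of_words(text):
    """Lowercased whitespace tokens with their counts."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two token-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * \
        math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def retrieve(query, store, k=2):
    """Return the k stored examples whose instruction is most similar to the query."""
    q = bag_of_words(query)
    ranked = sorted(store, key=lambda ex: cosine(q, bag_of_words(ex[0])),
                    reverse=True)
    return ranked[:k]


def build_prompt(query, store, k=2):
    """Prepend the retrieved examples to the query as few-shot context."""
    shots = retrieve(query, store, k)
    parts = [f"Instruction: {q}\nAnswer: {a}" for q, a in shots]
    parts.append(f"Instruction: {query}\nAnswer:")
    return "\n\n".join(parts)


if __name__ == "__main__":
    print(build_prompt("What is the function of the protein insulin",
                       EXAMPLES, k=1))
```

The resulting prompt would then be passed to an LLM unchanged; because the knowledge lives in the example store rather than in model weights, the store can be extended without retraining, which is the sense in which such an approach is non-parametric.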


Pages: 255-267

Number of pages: 13

