DRAK: Unlocking Molecular Insights with Domain-Specific Retrieval-Augmented Knowledge in LLMs

Authors
Liu, Jinzhe [1, 2]
Huang, Xiangsheng [3 ]
Chen, Zhuo [4 ]
Fang, Yin [4 ]
Affiliations
[1] Chinese Acad Sci, Inst Comp Technol, Beijing, Peoples R China
[2] Univ Chinese Acad Sci, Beijing, Peoples R China
[3] Chinese Acad Sci, Xiongan Inst Innovat, Hebei Key Lab Cognit Intelligence, Baoding, Peoples R China
[4] Zhejiang Univ, Coll Comp Sci & Technol, Hangzhou, Peoples R China
Keywords
Retrieval-augmented knowledge; Knowledge injection; Biomolecular domain; LANGUAGE
DOI
10.1007/978-981-97-9434-8_20
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Large Language Models (LLMs) typically exhibit knowledge gaps in specialized applications because they are pre-trained on general-purpose text corpora. Although fine-tuning and modality alignment aim to bridge this gap, they cannot provide comprehensive knowledge coverage, so LLMs still deliver imprecise responses. To address these challenges, we introduce a scalable and adaptable non-parametric knowledge injection framework, Domain-specific Retrieval-Augmented Knowledge (DRAK), which aims to strengthen LLMs' knowledge reasoning ability through context examples. DRAK integrates retrieval enhancement with structured knowledge graph recall of high-quality instances, and uses the retrieved examples to unlock LLMs' context-relevant molecular learning capabilities, offering a universal solution for specific domains. We validate DRAK's effectiveness and generalizability in the biomolecular domain, where it achieves superior performance across twelve tasks involving both molecule-oriented and bioinformatics texts within the Mol-Instructions dataset. This demonstration of DRAK's ability to unearth molecular insights establishes a standardized approach for LLMs in navigating knowledge-intensive challenges.
Pages: 255-267
Number of pages: 13
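
The abstract describes retrieving high-quality instances and supplying them to the model as context examples. The record gives no implementation details, so the following is only a minimal sketch of that general retrieve-then-prompt pattern, not the authors' method. The token-overlap (Jaccard) scoring, the function names, and the toy `EXAMPLE_STORE` are all assumptions made for illustration, and the knowledge graph recall mentioned in the abstract is omitted entirely.

```python
import re


def tokenize(text):
    """Lowercase and split into alphanumeric tokens (a deliberately crude tokenizer)."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def jaccard(a, b):
    """Jaccard similarity between two token sets; 0.0 when both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def retrieve_examples(query, store, k=2):
    """Return the k stored (question, answer) pairs most similar to the query."""
    query_tokens = tokenize(query)
    ranked = sorted(
        store,
        key=lambda pair: jaccard(query_tokens, tokenize(pair[0])),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(query, examples):
    """Place retrieved examples before the query as in-context demonstrations."""
    blocks = [f"Question: {q}\nAnswer: {a}" for q, a in examples]
    blocks.append(f"Question: {query}\nAnswer:")
    return "\n\n".join(blocks)


# Hypothetical toy store (an assumption); a real one would hold curated domain instances.
EXAMPLE_STORE = [
    ("What is the molecular formula of water?", "H2O"),
    ("What is the molecular formula of methane?", "CH4"),
    ("Which organelle produces most cellular ATP?", "The mitochondrion"),
]

if __name__ == "__main__":
    question = "What is the molecular formula of ethanol?"
    print(build_prompt(question, retrieve_examples(question, EXAMPLE_STORE)))
```

Because the retrieved examples live in the prompt and not in the model weights, the store can be extended or swapped without retraining, which is the sense in which such an approach is non-parametric. A practical system would likely replace the token-overlap score with a learned embedding similarity.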