Retrieval-Augmented Generation Approach: Document Question Answering using Large Language Model

Cited by: 0

Authors:

Muludi, Kurnia [1]

Fitria, Kaira Milani [1]

Triloka, Joko [1]

Sutedi [1]

Affiliations:

[1] Darmajaya Informat & Business Inst, Informat Engn Grad Program, Bandar Lampung, Indonesia

Source:

INTERNATIONAL JOURNAL OF ADVANCED COMPUTER SCIENCE AND APPLICATIONS | 2024, Vol. 15, No. 03

Keywords:

Natural Language Processing; Large Language Model; Retrieval Augmented Generation; Question Answering; GPT;

DOI:

10.14569/IJACSA.2024.0150379

Chinese Library Classification (CLC) number:

TP301 [Theory and Methods];

Discipline classification code:

081202;

Abstract:

This study applies the Retrieval-Augmented Generation (RAG) method to improve Question-Answering (QA) systems, addressing document processing problems in Natural Language Processing. The authors present it as the latest advance in applying RAG to document question-answering applications, one that overcomes obstacles faced by previous QA systems. RAG combines search over a vector store with the text generation of Large Language Models, offering a time-efficient alternative to manual reading. The research evaluates a RAG system that uses the GPT-3.5-turbo model (Generative Pre-trained Transformer 3.5) from ChatGPT, assesses its impact on document data processing, and compares it with other applications. The research also provides datasets for testing the capabilities of document QA systems. Both the proposed dataset and the Stanford Question Answering Dataset (SQuAD) are used for performance testing. The study contributes theoretically by advancing methodologies and knowledge representation and by supporting benchmarking in research communities. The results favor RAG on every reported metric: a precision of 0.74 in Recall-Oriented Understudy for Gisting Evaluation (ROUGE) testing, against 0.5 for the other applications; an F1 score of 0.88 in BERTScore, against 0.81; a precision of 0.28 in Bilingual Evaluation Understudy (BLEU) testing, against 0.09; and a Jaccard Similarity of 0.33, against 0.04. These findings underscore RAG's efficiency and competitiveness and suggest a positive impact on various industrial sectors through advanced Artificial Intelligence (AI) technology.
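
Two of the metrics named in the abstract, Jaccard Similarity and ROUGE precision, can be illustrated with a minimal Python sketch. This is not the authors' evaluation code: the whitespace tokenizer, the function names, and the restriction to unigrams (ROUGE-1) are all assumptions made for illustration, and the record does not say which ROUGE variant the paper reports.

```python
from collections import Counter


def tokenize(text: str) -> list[str]:
    """Lowercase and split on whitespace (an assumed, simplistic tokenizer)."""
    return text.lower().split()


def jaccard_similarity(candidate: str, reference: str) -> float:
    """Size of the shared vocabulary divided by the size of the combined vocabulary."""
    a, b = set(tokenize(candidate)), set(tokenize(reference))
    if not a and not b:
        return 1.0  # convention chosen here: two empty texts count as identical
    return len(a & b) / len(a | b)


def unigram_precision(candidate: str, reference: str) -> float:
    """Clipped unigram overlap divided by candidate length (ROUGE-1 precision)."""
    cand, ref = Counter(tokenize(candidate)), Counter(tokenize(reference))
    total = sum(cand.values())
    if total == 0:
        return 0.0
    # Counter intersection keeps the minimum count per token, i.e. clipped matches.
    overlap = sum((cand & ref).values())
    return overlap / total
```

For a generated answer "the cat sat on the mat" and a reference answer "the cat lay on the mat", the two texts share 4 of 6 distinct words, and 5 of the 6 generated tokens are matched in the reference. Both scores lie between 0 and 1, with higher values indicating closer agreement with the reference.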


Pages: 776-785

Page count: 10

