Retrieval-Augmented Generation Approach: Document Question Answering using Large Language Model

Cited by: 0
Authors
Muludi, Kurnia [1]
Fitria, Kaira Milani [1]
Triloka, Joko [1]
Sutedi [1]
Affiliations
[1] Darmajaya Informat & Business Inst, Informat Engn Grad Program, Bandar Lampung, Indonesia
Keywords
Natural Language Processing; Large Language Model; Retrieval Augmented Generation; Question Answering; GPT
DOI
10.14569/IJACSA.2024.0150379
Chinese Library Classification number
TP301 [Theory, Methods]
Discipline classification code
081202
Abstract
This study applies the Retrieval-Augmented Generation (RAG) method to improve Question-Answering (QA) systems, addressing document processing problems in Natural Language Processing. It presents a recent advance in applying RAG to document question-answering applications, overcoming obstacles of previous QA systems. RAG combines search over a vector store with the text generation mechanism of Large Language Models, offering a time-efficient alternative to reading documents manually. The research evaluates a RAG system built on Generative Pre-trained Transformer 3.5 (GPT-3.5-turbo, the model used in ChatGPT) and its impact on document data processing, comparing it with other applications. The research also provides a dataset for testing the capabilities of document QA systems. Both the proposed dataset and the Stanford Question Answering Dataset (SQuAD) are used for performance testing. The study contributes theoretically by advancing methodologies and knowledge representation, and it supports benchmarking in research communities. The results favor RAG on every reported metric: a precision of 0.74 in Recall-Oriented Understudy for Gisting Evaluation (ROUGE) testing, compared with 0.5 for other applications; an F1 score of 0.88 in BERTScore, compared with 0.81 for other QA apps; a precision of 0.28 in Bilingual Evaluation Understudy (BLEU) testing, compared with 0.09; and a Jaccard Similarity of 0.33, compared with 0.04. These findings underscore RAG's efficiency and competitiveness, promising a positive impact on various industrial sectors through advanced Artificial Intelligence (AI) technology.
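To make the overlap-based scores in the abstract more concrete, the following is a minimal Python sketch of Jaccard similarity and clipped unigram precision (the overlap count behind ROUGE-1 precision; BLEU additionally uses higher-order n-grams and a brevity penalty). It is not the authors' evaluation code. The tokenizer, the function names, and the two example sentences are assumptions made purely for illustration.

```python
import re
from collections import Counter


def tokenize(text: str) -> list[str]:
    """Lowercase and split into alphanumeric tokens (a simplistic, assumed tokenizer)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def jaccard_similarity(candidate: str, reference: str) -> float:
    """|A ∩ B| / |A ∪ B| over the sets of unique tokens in each text."""
    a, b = set(tokenize(candidate)), set(tokenize(reference))
    if not a and not b:
        return 1.0  # two empty texts are treated as identical
    return len(a & b) / len(a | b)


def unigram_precision(candidate: str, reference: str) -> float:
    """Fraction of candidate tokens that also occur in the reference (counts clipped)."""
    cand = Counter(tokenize(candidate))
    ref = Counter(tokenize(reference))
    total = sum(cand.values())
    if total == 0:
        return 0.0
    overlap = sum(min(count, ref[token]) for token, count in cand.items())
    return overlap / total


# Hypothetical system answer and reference answer, invented for this example.
candidate = "RAG retrieves passages from a vector store"
reference = "RAG retrieves relevant passages from the vector store before generating"

print(round(jaccard_similarity(candidate, reference), 3))  # 0.545 (6 shared / 11 total unique tokens)
print(round(unigram_precision(candidate, reference), 3))   # 0.857 (6 of 7 candidate tokens matched)
```

Because Jaccard similarity divides by the union of both vocabularies, it penalizes any extra wording on either side, which is one plausible reason its reported values (0.33 and 0.04) sit well below the ROUGE precision figures.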
Pages: 776-785
Number of pages: 10
Related papers
50 in total
  • [31] Using the Retrieval-Augmented Generation to Improve the Question-Answering System in Human Health Risk Assessment: The Development and Application
    Meng, Wenjun
    Li, Yuzhe
    Chen, Lili
    Dong, Zhaomin
    ELECTRONICS, 2025, 14 (02):
  • [32] TrojanRAG: Retrieval-Augmented Generation Can Be Backdoor Driver in Large Language Models
    Shanghai Jiao Tong University, China
    arXiv,
  • [33] Query Rewriting for Retrieval-Augmented Large Language Models
    Ma, Xinbei
    Gong, Yeyun
    He, Pengcheng
    Zhao, Hai
    Duan, Nan
    2023 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING, EMNLP 2023, 2023, : 5303 - 5315
  • [34] SafetyRAG: Towards Safe Large Language Model-Based Application through Retrieval-Augmented Generation
    Omri, Sihem
    Abdelkader, Manel
    Hamdi, Mohamed
    JOURNAL OF ADVANCES IN INFORMATION TECHNOLOGY, 2025, 16 (02) : 243 - 250
  • [35] Supercharging Document Composition with Generative AI: A Secure, Custom Retrieval-Augmented Generation Approach
    Chen, Andre
    Tran, Sieu
    2024 11TH IEEE SWISS CONFERENCE ON DATA SCIENCE, SDS 2024, 2024, : 123 - 130
  • [36] Evaluation of the integration of retrieval-augmented generation in large language model for breast cancer nursing care responses
    Xu, Ruiyu
    Hong, Ying
    Zhang, Feifei
    Xu, Hongmei
    SCIENTIFIC REPORTS, 2024, 14 (01):
  • [37] SurgeryLLM: a retrieval-augmented generation large language model framework for surgical decision support and workflow enhancement
    Ong, Chin Siang
    Obey, Nicholas T.
    Zheng, Yanan
    Cohan, Arman
    Schneider, Eric B.
    npj Digital Medicine, 2024, 7 (01)
  • [38] Retrieval-Augmented Generation-aided causal identification of aviation accidents: A large language model methodology
    Ren, Tengfei
    Zhang, Zhipeng
    Jia, Bo
    Zhang, Shiwen
    EXPERT SYSTEMS WITH APPLICATIONS, 2025, 278
  • [39] Clinfo.ai: An Open-Source Retrieval-Augmented Large Language Model System for Answering Medical Questions using Scientific Literature
    Lozano, Alejandro
    Fleming, Scott L.
    Chiang, Chia-Chun
    Shah, Nigam
    BIOCOMPUTING 2024, PSB 2024, 2024, : 8 - 23
  • [40] Adaptive Control of Retrieval-Augmented Generation for Large Language Models Through Reflective Tags
    Yao, Chengyuan
    Fujita, Satoshi
    ELECTRONICS, 2024, 13 (23):