Retrieval-Augmented Generation Approach: Document Question Answering using Large Language Model

Cited by: 0
Authors
Muludi, Kurnia [1]
Fitria, Kaira Milani [1]
Triloka, Joko [1]
Sutedi [1]
Affiliations
[1] Darmajaya Informat & Business Inst, Informat Engn Grad Program, Bandar Lampung, Indonesia
Keywords
Natural Language Processing; Large Language Model; Retrieval Augmented Generation; Question Answering; GPT
DOI
10.14569/IJACSA.2024.0150379
Chinese Library Classification number
TP301 [Theory, Methods]
Discipline classification code
081202
Abstract
This study introduces the Retrieval Augmented Generation (RAG) method to improve Question-Answering (QA) systems by addressing document processing problems in Natural Language Processing. It represents the latest breakthrough in applying RAG to document question-answering applications, overcoming the obstacles of previous QA systems. RAG combines search over a vector store with the text generation mechanism of Large Language Models, offering a time-efficient alternative to manual reading. The research evaluates a RAG system that uses Generative Pre-trained Transformer 3.5 (GPT-3.5-turbo), the model behind ChatGPT, and its impact on document data processing, comparing it with other applications. This research also provides datasets to test the capabilities of document QA systems. The proposed dataset and the Stanford Question Answering Dataset (SQuAD) are used for performance testing. The study contributes theoretically by advancing methodologies and knowledge representation, supporting benchmarking in research communities. Results highlight RAG's superiority over the other QA applications tested: a precision of 0.74 in Recall-Oriented Understudy for Gisting Evaluation (ROUGE) testing, against 0.5; an F1 score of 0.88 in BERTScore, against 0.81; a precision of 0.28 in Bilingual Evaluation Understudy (BLEU) testing, against 0.09; and a Jaccard Similarity of 0.33, against 0.04. These findings underscore RAG's efficiency and competitiveness, promising a positive impact on various industrial sectors through advanced Artificial Intelligence (AI) technology.
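The retrieve-then-generate flow and two of the overlap metrics named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the bag-of-words cosine similarity is a stand-in for a real embedding model and vector store, the generation call to GPT-3.5-turbo is omitted (only the prompt is assembled), and all function names, the sample chunks, and the prompt wording are hypothetical. The ROUGE figure is approximated here by unigram (ROUGE-1) precision, which may differ from the variant the paper reports.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(question, chunks, k=1):
    """Return the k chunks most similar to the question (toy vector store)."""
    q = Counter(tokenize(question))
    ranked = sorted(chunks, key=lambda c: cosine(q, Counter(tokenize(c))), reverse=True)
    return ranked[:k]


def build_prompt(question, context_chunks):
    """Assemble the text that would be sent to the language model (hypothetical wording)."""
    context = "\n".join(context_chunks)
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"


def jaccard(candidate, reference):
    """Jaccard similarity: shared unique tokens divided by all unique tokens."""
    c, r = set(tokenize(candidate)), set(tokenize(reference))
    return len(c & r) / len(c | r) if c | r else 0.0


def rouge1_precision(candidate, reference):
    """ROUGE-1 precision: clipped unigram overlap divided by candidate length."""
    c, r = Counter(tokenize(candidate)), Counter(tokenize(reference))
    overlap = sum(min(n, r[t]) for t, n in c.items())
    total = sum(c.values())
    return overlap / total if total else 0.0


# Illustrative data only; not taken from the paper's dataset.
chunks = [
    "The warranty covers repairs for two years.",
    "Shipping takes five business days.",
]
question = "How long does the warranty last?"
top = retrieve(question, chunks)
prompt = build_prompt(question, top)
```

In a full pipeline the assembled prompt would be passed to the language model, and the generated answer would be scored against a reference answer with functions such as `jaccard` and `rouge1_precision`.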
Pages: 776-785
Number of pages: 10