Retrieval-Augmented Generation Approach: Document Question Answering using Large Language Model

Cited by: 0
Authors
Muludi, Kurnia [1]
Fitria, Kaira Milani [1]
Triloka, Joko [1]
Sutedi [1]
Affiliations
[1] Darmajaya Informat & Business Inst, Informat Engn Grad Program, Bandar Lampung, Indonesia
Keywords
Natural Language Processing; Large Language Model; Retrieval Augmented Generation; Question Answering; GPT
DOI
10.14569/IJACSA.2024.0150379
Chinese Library Classification number
TP301 [Theory, Methods]
Subject classification code
081202
Abstract
This study applies the Retrieval-Augmented Generation (RAG) method to improve Question-Answering (QA) systems for document processing, a recurring problem in Natural Language Processing. It represents a recent advance in applying RAG to document question-answering applications, addressing obstacles faced by earlier QA systems. RAG combines search over a vector store with text generation by a Large Language Model, offering a time-efficient alternative to reading documents manually. The research evaluates a RAG system built on Generative Pre-trained Transformer 3.5 (GPT-3.5-turbo, the model behind ChatGPT), assesses its impact on document data processing, and compares it with other applications. The study also provides a dataset for testing document QA systems; this proposed dataset and the Stanford Question Answering Dataset (SQuAD) are used for performance testing. The study contributes theoretically by advancing methodologies and knowledge representation, supporting benchmarking in research communities. In the reported results, RAG outperforms the other QA applications on every metric: a precision of 0.74 versus 0.5 in Recall-Oriented Understudy for Gisting Evaluation (ROUGE) testing; an F1 score of 0.88 versus 0.81 in BERTScore; a precision of 0.28 versus 0.09 in Bilingual Evaluation Understudy (BLEU) testing; and a Jaccard Similarity of 0.33 versus 0.04. These findings underscore RAG's efficiency and competitiveness and suggest a positive impact on various industrial sectors through advanced Artificial Intelligence (AI) technology.
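As a rough illustration of two of the overlap metrics named in the abstract, the sketch below computes Jaccard similarity and a unigram ROUGE-style precision between a generated answer and a reference answer. It is not the authors' evaluation code: the lowercase whitespace tokenizer, the choice of unigrams (ROUGE-1), and the function names are all assumptions made for this example.

```python
from collections import Counter


def tokenize(text: str) -> list[str]:
    # Assumption: lowercase whitespace tokenization. The record does not
    # state which tokenizer the study used.
    return text.lower().split()


def jaccard_similarity(candidate: str, reference: str) -> float:
    """Size of the shared vocabulary divided by the size of the combined vocabulary."""
    a, b = set(tokenize(candidate)), set(tokenize(reference))
    if not a and not b:
        return 1.0  # two empty texts are treated as identical
    return len(a & b) / len(a | b)


def rouge1_precision(candidate: str, reference: str) -> float:
    """Fraction of candidate unigrams that also occur in the reference.

    Counts are clipped, so a word repeated in the candidate is credited
    at most as many times as it appears in the reference.
    """
    cand = Counter(tokenize(candidate))
    ref = Counter(tokenize(reference))
    total = sum(cand.values())
    if total == 0:
        return 0.0
    overlap = sum((cand & ref).values())  # Counter & keeps the minimum counts
    return overlap / total
```

For example, `jaccard_similarity("the cat sat", "the cat ran")` shares two of four distinct words, and `rouge1_precision("the the cat", "the cat")` credits two of the three candidate tokens.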
Pages: 776-785
Page count: 10