Retrieval-augmented Generation across Heterogeneous Knowledge

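Author
Yu, Wenhao [1]
Affiliation
[1] Univ Notre Dame, Notre Dame, IN 46556 USA
Funding
U.S. National Science Foundation
Keywords
DOI
Not available
Chinese Library Classification
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract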
Retrieval-augmented generation (RAG) methods have received increasing attention from the NLP community and have achieved state-of-the-art performance on many downstream NLP tasks. Compared with conventional pretrained generation models, RAG methods offer notable advantages, such as easy knowledge acquisition, strong scalability, and low training cost. Although existing RAG models have been applied to various knowledge-intensive NLP tasks, such as open-domain QA and dialogue systems, most of this work has focused on retrieving unstructured text documents from Wikipedia. In this paper, I first elaborate on the current obstacles to retrieving knowledge from a single-source homogeneous corpus. Then, I present evidence from both the existing literature and my own experiments, and propose multiple solutions for retrieval-augmented generation across heterogeneous knowledge.
Pages: 52-58
Page count: 7