Learning Customized Visual Models with Retrieval-Augmented Knowledge

Cited by: 5
Authors
Liu, Haotian [1]
Son, Kilho [2]
Yang, Jianwei [2]
Liu, Ce [2]
Gao, Jianfeng [2]
Lee, Yong Jae [1]
Li, Chunyuan [2]
Affiliations
[1] Univ Wisconsin Madison, Madison, WI 53706 USA
[2] Microsoft, Redmond, WA USA
DOI
10.1109/CVPR52729.2023.01454
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Image-text contrastive learning models such as CLIP have demonstrated strong task transfer ability. The high generality and usability of these visual models are achieved via a web-scale data collection process that ensures broad concept coverage, followed by expensive pre-training to feed all the knowledge into model weights. Alternatively, we propose REACT, REtrieval-Augmented CusTomization, a framework to acquire the relevant web knowledge to build customized visual models for target domains. We retrieve the most relevant image-text pairs (about 3% of CLIP pre-training data) from the web-scale database as external knowledge and propose to customize the model by training only new modularized blocks while freezing all the original weights. The effectiveness of REACT is demonstrated via extensive experiments on classification, retrieval, detection, and segmentation tasks, including zero-shot, few-shot, and full-shot settings. In particular, on the zero-shot classification task, compared with CLIP, it achieves up to 5.4% improvement on ImageNet and 3.7% on the ELEVATER benchmark (20 datasets).
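The retrieval step mentioned in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: it assumes each image-text pair has already been embedded as a plain vector, and it uses cosine similarity with a brute-force scan, which the abstract does not specify. The names `cosine`, `retrieve_top_k`, and `toy_db` are hypothetical and exist only for this example.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def retrieve_top_k(query, database, k):
    """Return the ids of the k database entries most similar to the query.

    `database` maps a hypothetical pair id to its embedding vector.
    A brute-force scan is used here purely for illustration.
    """
    scored = [(cosine(query, emb), pair_id) for pair_id, emb in database.items()]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [pair_id for _, pair_id in scored[:k]]


# Toy embeddings standing in for a large image-text database (assumed data).
toy_db = {
    "pair_a": [1.0, 0.0],
    "pair_b": [0.9, 0.1],
    "pair_c": [0.0, 1.0],
}

# A query embedding derived from a target-domain concept (assumed data).
result = retrieve_top_k([1.0, 0.0], toy_db, k=2)
print(result)
```

In this sketch the retrieved subset would then serve as the training data for the newly added blocks, while the original model weights stay frozen, as the abstract describes.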
Pages: 15148-15158
Number of pages: 11