Understanding Retrieval Robustness for Retrieval-Augmented Image Captioning

Cited by: 0

Authors:

Li, Wenyan [1]

Li, Jiaang [1]

Ramos, Rita [2]

Tang, Raphael [3]

Elliott, Desmond [1]

Affiliations:

[1] Univ Copenhagen, Dept Comp Sci, Copenhagen, Denmark

[2] Univ Lisbon, Inst Super Tecn, INESC ID, Lisbon, Portugal

[3] Comcast Appl AI, Philadelphia, PA USA

Source:

PROCEEDINGS OF THE 62ND ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, VOL 1: LONG PAPERS | 2024

Keywords:

DOI:

Not available

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory]

Discipline classification codes:

081104; 0812; 0835; 1405

Abstract:

Recent advances in retrieval-augmented models for image captioning highlight the benefit of retrieving related captions for efficient, lightweight models with strong domain-transfer capabilities. While these models demonstrate the success of retrieval augmentation, retrieval models are still far from perfect in practice: the retrieved information can sometimes mislead the model, resulting in incorrect generation and worse performance. In this paper, we analyze the robustness of a retrieval-augmented captioning model SMALLCAP. Our analysis shows that the model is sensitive to tokens that appear in the majority of the retrieved captions, and the input attribution shows that those tokens are likely copied into the generated output. Given these findings, we propose to train the model by sampling retrieved captions from more diverse sets. This decreases the chance that the model learns to copy majority tokens, and improves both in-domain and cross-domain performance.
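The training change described in the abstract (sampling retrieved captions from a more diverse set) might be sketched as follows. This is only an illustration under stated assumptions, not the authors' implementation: the function name `sample_retrieved_captions`, the parameters `k` and `pool_size`, and their default values are hypothetical, and uniform sampling from the top-ranked pool is one plausible reading of "more diverse sets".

```python
import random
from typing import List, Optional, Sequence


def sample_retrieved_captions(
    ranked_captions: Sequence[str],
    k: int = 4,            # hypothetical: number of captions placed in the prompt
    pool_size: int = 7,    # hypothetical: size of the top-ranked pool to draw from
    rng: Optional[random.Random] = None,
) -> List[str]:
    """Draw k captions uniformly at random from the top `pool_size` retrieved captions.

    Always taking the top-k captions tends to produce near-duplicate prompts;
    drawing from a larger pool varies which tokens form the majority.
    """
    if k <= 0 or pool_size <= 0:
        raise ValueError("k and pool_size must be positive")
    if rng is None:
        rng = random.Random()
    pool = list(ranked_captions[:pool_size])
    if len(pool) <= k:
        # Not enough captions to sample from: fall back to the whole pool.
        return pool
    return rng.sample(pool, k)


if __name__ == "__main__":
    retrieved = [f"caption {i}" for i in range(10)]  # placeholder retrieval results
    print(sample_retrieved_captions(retrieved, rng=random.Random(0)))
```

At inference time one would presumably keep the plain top-k captions; the sampling step is meant only for building training prompts.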

Citation

Pages: 9285-9299

Page count: 15
