Understanding Retrieval Robustness for Retrieval-Augmented Image Captioning

Citations: 0
Authors
Li, Wenyan [1 ]
Li, Jiaang [1 ]
Ramos, Rita [2 ]
Tang, Raphael [3 ]
Elliott, Desmond [1 ]
Affiliations
[1] Univ Copenhagen, Dept Comp Sci, Copenhagen, Denmark
[2] Univ Lisbon, Inst Super Tecn, INESC ID, Lisbon, Portugal
[3] Comcast Appl AI, Philadelphia, PA USA
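Keywords
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract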
Recent advances in retrieval-augmented models for image captioning highlight the benefit of retrieving related captions for efficient, lightweight models with strong domain-transfer capabilities. While these models demonstrate the success of retrieval augmentation, retrieval models are still far from perfect in practice: the retrieved information can sometimes mislead the model, resulting in incorrect generation and worse performance. In this paper, we analyze the robustness of a retrieval-augmented captioning model SMALLCAP. Our analysis shows that the model is sensitive to tokens that appear in the majority of the retrieved captions, and the input attribution shows that those tokens are likely copied into the generated output. Given these findings, we propose to train the model by sampling retrieved captions from more diverse sets. This decreases the chance that the model learns to copy majority tokens, and improves both in-domain and cross-domain performance.
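The two ideas in the abstract, majority tokens and sampling retrieved captions from a more diverse set, can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the procedure, so the function names, whitespace tokenization, the 0.5 majority threshold, and the `k` and `pool_size` values are all hypothetical choices made for illustration.

```python
import random
from collections import Counter


def majority_tokens(captions, threshold=0.5):
    """Return tokens that occur in more than `threshold` of the captions.

    Assumption: lowercase whitespace tokenization stands in for whatever
    tokenizer a real captioning model would use.
    """
    counts = Counter()
    for caption in captions:
        # Count each token once per caption, not once per occurrence.
        counts.update(set(caption.lower().split()))
    return {tok for tok, c in counts.items() if c / len(captions) > threshold}


def sample_captions(ranked_captions, k=4, pool_size=7, rng=None):
    """Sample k captions from the top `pool_size` retrieved captions.

    Assumption: instead of always conditioning on the top-k captions,
    training draws k captions uniformly from a larger ranked pool, so the
    prompt captions vary across training steps.
    """
    rng = rng or random.Random()
    pool = ranked_captions[:pool_size]
    return rng.sample(pool, min(k, len(pool)))


retrieved = [
    "a dog runs on the beach",
    "a dog plays with a ball",
    "a cat sleeps on a sofa",
]
print(majority_tokens(retrieved))
print(sample_captions(retrieved, k=2, pool_size=3, rng=random.Random(0)))
```

In this toy input, "a" and "dog" appear in more than half of the retrieved captions, so they are the tokens a model would be most tempted to copy; sampling from a wider pool changes which tokens form the majority from one training step to the next.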
Pages: 9285 - 9299
Page count: 15
Related papers
50 in total
  • [31] Diversify Question Generation with Retrieval-Augmented Style Transfer
    Gou, Qi
    Xia, Zehua
    Yu, Bowen
    Yu, Haiyang
    Huang, Fei
    Li, Yongbin
    Nguyen, Cam-Tu
    2023 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING, EMNLP 2023, 2023, : 1677 - 1690
  • [32] Enhancing Retrieval-Augmented Large Language Models with Iterative Retrieval-Generation Synergy
    Shao, Zhihong
    Gong, Yeyun
    Shen, Yelong
    Huang, Minlie
    Duan, Nan
    Chen, Weizhu
    FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (EMNLP 2023), 2023, : 9248 - 9274
  • [33] Toward Robust RALMs: Revealing the Impact of Imperfect Retrieval on Retrieval-Augmented Language Models
    Park, Seong-Il
    Lee, Jay-Yoon
    TRANSACTIONS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, 2024, 12 : 1686 - 1702
  • [34] Revisiting and Improving Retrieval-Augmented Deep Assertion Generation
    Sun, Weifeng
    Li, Hongyan
    Yan, Meng
    Lei, Yan
    Zhang, Hongyu
    2023 38TH IEEE/ACM INTERNATIONAL CONFERENCE ON AUTOMATED SOFTWARE ENGINEERING, ASE, 2023, : 1123 - 1135
  • [35] Retrieval-Augmented Few-shot Text Classification
    Yu, Guoxin
    Liu, Lemao
    Jiang, Haiyun
    Shi, Shuming
    Ao, Xiang
    FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS - EMNLP 2023, 2023, : 6721 - 6735
  • [36] Web Application for Retrieval-Augmented Generation: Implementation and Testing
    Radeva, Irina
    Popchev, Ivan
    Doukovska, Lyubka
    Dimitrova, Miroslava
    ELECTRONICS, 2024, 13 (07)
  • [37] Performance Evaluation of Vector Embeddings with Retrieval-Augmented Generation
    Kukreja, Sanjay
    Kumar, Tarun
    Bharate, Vishal
    Purohit, Amit
    Dasgupta, Abhijit
    Guha, Debashis
    2024 9TH INTERNATIONAL CONFERENCE ON COMPUTER AND COMMUNICATION SYSTEMS, ICCCS 2024, 2024, : 333 - 340
  • [38] ReadsRE: Retrieval-Augmented Distantly Supervised Relation Extraction
    Zhang, Yue
    Fei, Hongliang
    Li, Ping
    SIGIR '21 - PROCEEDINGS OF THE 44TH INTERNATIONAL ACM SIGIR CONFERENCE ON RESEARCH AND DEVELOPMENT IN INFORMATION RETRIEVAL, 2021, : 2257 - 2262
  • [39] Learning Customized Visual Models with Retrieval-Augmented Knowledge
    Liu, Haotian
    Son, Kilho
    Yang, Jianwei
    Liu, Ce
    Gao, Jianfeng
    Lee, Yong Jae
    Li, Chunyuan
    2023 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2023, : 15148 - 15158
  • [40] Benchmarking Large Language Models in Retrieval-Augmented Generation
    Chen, Jiawei
    Lin, Hongyu
    Han, Xianpei
    Sun, Le
    THIRTY-EIGHTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, VOL 38 NO 16, 2024, : 17754 - 17762