Crossing the Gap: Domain Generalization for Image Captioning

Cited by: 3
Authors
Ren, Yuchen [1, 2]
Mao, Zhendong [1, 3]
Fang, Shancheng [1]
Lu, Yan [2]
He, Tong [2]
Du, Hao [1]
Zhang, Yongdong [1, 3]
Ouyang, Wanli [2]
Affiliations
[1] University of Science and Technology of China, Hefei, People's Republic of China
[2] Shanghai Artificial Intelligence Laboratory, Shanghai, People's Republic of China
[3] Hefei Comprehensive National Science Center, Institute of Artificial Intelligence, Hefei, People's Republic of China
Funding
National Natural Science Foundation of China; National Key Research and Development Program of China
DOI
10.1109/CVPR52729.2023.00281
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Existing image captioning methods assume either that the training and testing data come from the same domain or that data from the target domain (i.e., the domain in which the testing data lie) are accessible. However, this assumption does not hold in real-world applications where data from the target domain are inaccessible. In this paper, we introduce a new setting called Domain Generalization for Image Captioning (DGIC), in which data from the target domain are unseen during learning. We first construct a benchmark dataset for DGIC, which helps us investigate models' domain generalization (DG) ability on unseen domains. With the support of the new benchmark, we further propose a new framework called language-guided semantic metric learning (LSML) for the DGIC setting. Experiments on multiple datasets demonstrate the challenge of the task and the effectiveness of our newly proposed benchmark and LSML framework.
Pages: 2871-2880
Number of pages: 10