Learning Scene Graph for Better Cross-Domain Image Captioning

Cited by: 0
Authors
Jia, Junhua [1]
Xin, Xiaowei [1]
Gao, Xiaoyan [1]
Ding, Xiangqian [1]
Pang, Shunpeng [2]
Affiliations
[1] Ocean Univ China, Fac Informat Sci & Engn, Shandong 266000, Peoples R China
[2] Weifang Univ, Sch Comp Engn, Shandong 261061, Peoples R China
Keywords
Image Captioning; Scene Graph; Text-to-Image Synthesis; Dual Learning
DOI
10.1007/978-981-99-8435-0_10
Chinese Library Classification number
TP18 [Artificial Intelligence Theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Current image captioning (IC) methods achieve good results within a single domain, primarily because they are trained on large amounts of annotated data. However, their performance degrades when they are applied to new domains. To address this, we propose a cross-domain image captioning framework, SGCDIC, which achieves cross-domain generalization by simultaneously optimizing two coupled tasks: image captioning and text-to-image synthesis (TIS). Specifically, we propose a scene-graph-based approach, SGAT, for the image captioning task. The image synthesis task employs a GAN variant (DFGAN) to synthesize plausible images from the text descriptions generated by SGAT. We compare the generated images with the real images to improve captioning performance in new domains. We evaluate SGCDIC using MSCOCO as the source-domain data and Flickr30k and Oxford-102 as the new-domain data. Comparative experiments and ablation studies demonstrate that SGCDIC achieves substantially better performance than strong competitors on the cross-domain image captioning task.
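The coupling described in the abstract, where the captioner is improved by comparing a re-synthesized image with the real one, can be illustrated with a toy sketch. Everything below is an assumption made for illustration: the linear maps `W` (a stand-in for the captioner) and `V` (a stand-in for the frozen synthesizer), the squared-error comparison, and all function names are hypothetical. It does not implement SGAT or DFGAN, and it uses a continuous "caption" code, which sidesteps the discrete-text problem a real system would have to handle.

```python
import random

def matvec(m, v):
    """Multiply matrix m (list of rows) by vector v."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]

def mean_reconstruction_loss(W, V, images):
    """Mean squared error between each image and its re-synthesis V(W x)."""
    total = 0.0
    for x in images:
        recon = matvec(V, matvec(W, x))  # caption the image, then synthesize
        total += sum((r - xi) ** 2 for r, xi in zip(recon, x))
    return total / len(images)

def adapt_captioner(W, V, images, lr=0.01, steps=200):
    """Full-batch gradient descent on W only; V stays frozen (toy assumption)."""
    k, n = len(W), len(W[0])
    for _ in range(steps):
        grad = [[0.0] * n for _ in range(k)]
        for x in images:
            recon = matvec(V, matvec(W, x))
            err = [r - xi for r, xi in zip(recon, x)]
            # Gradient of ||V W x - x||^2 with respect to W is 2 * V^T err x^T.
            vt_err = [sum(V[i][a] * err[i] for i in range(len(V))) for a in range(k)]
            for a in range(k):
                for j in range(n):
                    grad[a][j] += 2.0 * vt_err[a] * x[j] / len(images)
        W = [[W[a][j] - lr * grad[a][j] for j in range(n)] for a in range(k)]
    return W

rng = random.Random(0)
n, k = 4, 2  # hypothetical image and caption-code dimensions
V = [[rng.uniform(-1, 1) for _ in range(k)] for _ in range(n)]
W = [[rng.uniform(-1, 1) for _ in range(n)] for _ in range(k)]
# Unlabeled "new-domain" images: no captions are needed for this signal.
target_images = [[rng.uniform(-1, 1) for _ in range(n)] for _ in range(8)]

loss_before = mean_reconstruction_loss(W, V, target_images)
W_adapted = adapt_captioner(W, V, target_images)
loss_after = mean_reconstruction_loss(W_adapted, V, target_images)
print(loss_after < loss_before)
```

The point of the sketch is only that the image-comparison signal requires no captions in the new domain; how SGCDIC actually propagates that signal through generated text is not specified in the abstract.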
Pages: 121-137
Page count: 17