Learning Scene Graph for Better Cross-Domain Image Captioning

Cited by: 0
Authors
Jia, Junhua [1]
Xin, Xiaowei [1]
Gao, Xiaoyan [1]
Ding, Xiangqian [1]
Pang, Shunpeng [2]
Affiliations
[1] Ocean Univ China, Fac Informat Sci & Engn, Shandong 266000, Peoples R China
[2] Weifang Univ, Sch Comp Engn, Shandong 261061, Peoples R China
Keywords
Image Captioning; Scene Graph; Text-to-Image Synthesis; Dual Learning;
DOI
10.1007/978-981-99-8435-0_10
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Current image captioning (IC) methods achieve good results within a single domain, primarily because they are trained on large amounts of annotated data. However, their performance degrades when they are applied to new domains. To address this, we propose a cross-domain image captioning framework, SGCDIC, which achieves cross-domain generalization by simultaneously optimizing two coupled tasks: image captioning and text-to-image synthesis (TIS). Specifically, we propose a scene-graph-based approach, SGAT, for the image captioning task. The image synthesis task employs a GAN variant (DFGAN) to synthesize plausible images from the text descriptions generated by SGAT. We compare the generated images with the real images to improve captioning performance in new domains. We conduct extensive experiments to evaluate SGCDIC, using MSCOCO as the source-domain data and Flickr30k and Oxford-102 as the new-domain data. Comparative experiments and ablation studies demonstrate that SGCDIC achieves substantially better performance than strong competitors on the cross-domain image captioning task.
Pages: 121-137
Page count: 17
Related papers
50 in total
  • [31] Feature selection for cross-scene hyperspectral image classification using cross-domain ReliefF
    Ye, Minchao
    Xu, Yongqiu
    Ji, Chenxi
    Chen, Hong
    Lu, Huijuan
    Qian, Yuntao
    INTERNATIONAL JOURNAL OF WAVELETS MULTIRESOLUTION AND INFORMATION PROCESSING, 2019, 17 (05)
  • [32] Cross-domain image description generation using transfer learning
    Kinghorn, Philip
    Zhang, Li
    DATA SCIENCE AND KNOWLEDGE ENGINEERING FOR SENSING DECISION SUPPORT, 2018, 11 : 1462 - 1469
  • [33] Privacy-preserving Cross-domain Recommendation with Federated Graph Learning
    Tian, Changxin
    Xie, Yuexiang
    Chen, Xu
    Li, Yaliang
    Zhao, Xin
    ACM TRANSACTIONS ON INFORMATION SYSTEMS, 2024, 42 (05)
  • [34] Heterogeneous Graph Embedding for Cross-Domain Recommendation Through Adversarial Learning
    Li, Jin
    Peng, Zhaohui
    Wang, Senzhang
    Xu, Xiaokang
    Yu, Philip S.
    Hao, Zhenyun
    DATABASE SYSTEMS FOR ADVANCED APPLICATIONS (DASFAA 2020), PT III, 2020, 12114 : 507 - 522
  • [35] A deep learning architecture for aligning cross-domain geographic knowledge graph
    Qiu, Qinjun
    Zheng, Shiyu
    Li, Jiali
    Tian, Miao
    Li, Zixuan
    Tao, Liufeng
    Zhu, Yunqiang
    Huang, Yi
    Chen, Zhanlong
    Xie, Zhong
    INTERNATIONAL JOURNAL OF GEOGRAPHICAL INFORMATION SCIENCE, 2025,
  • [36] Graph Disentangled Contrastive Learning with Personalized Transfer for Cross-Domain Recommendation
    Liu, Jing
    Sun, Lele
    Nie, Weizhi
    Jing, Peiguang
    Su, Yuting
    THIRTY-EIGHTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, VOL 38 NO 8, 2024, : 8769 - 8777
  • [37] Heterogeneous graph contrastive learning for cold start cross-domain recommendation
    Xie, Yuanzhen
    Yu, Chenyun
    Jin, Xinzhou
    Cheng, Lei
    Hu, Bo
    Li, Zang
    KNOWLEDGE-BASED SYSTEMS, 2024, 299
  • [38] Deep Transfer Learning for Biology Cross-Domain Image Classification
    Guo, Chunfeng
    Wei, Bin
    Yu, Kun
    JOURNAL OF CONTROL SCIENCE AND ENGINEERING, 2021, 2021
  • [39] Social image annotation via cross-domain subspace learning
    Si, Si
    Tao, Dacheng
    Wang, Meng
    Chan, Kwok-Ping
    MULTIMEDIA TOOLS AND APPLICATIONS, 2012, 56 (01) : 91 - 108
  • [40] Learning to Learn With Variational Inference for Cross-Domain Image Classification
    Zhang, Lei
    Du, Yingjun
    Shen, Jiayi
    Zhen, Xiantong
    IEEE TRANSACTIONS ON MULTIMEDIA, 2023, 25 : 3319 - 3328