Scene graph fusion and negative sample generation strategy for image-text matching

Cited by: 0
Authors
Wang, Liqin [1, 2, 3]
Yang, Pengcheng [1]
Wang, Xu [1, 2, 3]
Xu, Zhihong [1, 2, 3]
Dong, Yongfeng [1, 2, 3]
Affiliations
[1] Hebei Univ Technol, Sch Artificial Intelligence & Data Sci, Tianjin 300401, Peoples R China
[2] Hebei Prov Key Lab Big Data Calculat, Tianjin 300401, Peoples R China
[3] Hebei Data Driven Ind Intelligent Engn Res Ctr, Tianjin 300401, Peoples R China
Source
JOURNAL OF SUPERCOMPUTING | 2025 / Vol. 81 / Issue 01
Keywords
Image-text matching; Scene graph fusion; Explicit modeling; Negative sample;
DOI
10.1007/s11227-024-06652-2
Chinese Library Classification number
TP3 [Computing technology, computer technology];
Discipline classification code
0812;
Abstract
In image-text matching, scene graph-based approaches are commonly used to detect semantic associations between entities across modalities, improving cross-modal interaction by capturing finer-grained associations. However, the associations between images and texts are often modeled only implicitly, which leaves a semantic gap between image and text information. To address the lack of cross-modal information integration and to explicitly model fine-grained semantic information in images and texts, we propose a scene graph fusion and negative sample generation strategy for image-text matching (SGFNS). Furthermore, to strengthen the representation of the less salient features of similar images in image-text matching, we propose a negative sample generation strategy and introduce an extra loss function that incorporates the negative samples into training. Experiments verify the effectiveness of our model against current state-of-the-art models that use scene graphs directly.
Pages: 22
Related papers
50 in total
  • [31] Stacked Cross Attention for Image-Text Matching
    Lee, Kuang-Huei
    Chen, Xi
    Hua, Gang
    Hu, Houdong
    He, Xiaodong
    COMPUTER VISION - ECCV 2018, PT IV, 2018, 11208 : 212 - 228
  • [32] Multi-scale image-text matching network for scene and spatio-temporal images
    Yu, Runde
    Jin, Fusheng
    Qiao, Zhuang
    Yuan, Ye
    Wang, Guoren
    FUTURE GENERATION COMPUTER SYSTEMS-THE INTERNATIONAL JOURNAL OF ESCIENCE, 2023, 142 : 292 - 300
  • [33] Giving Text More Imagination Space for Image-text Matching
    Dong, Xinfeng
    Han, Longfei
    Zhang, Dingwen
    Liu, Li
    Han, Junwei
    Zhang, Huaxiang
    PROCEEDINGS OF THE 31ST ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, MM 2023, 2023, : 6359 - 6368
  • [34] Image-text dual neural network with decision strategy for small-sample image classification
    Zhu, Fangyi
    Ma, Zhanyu
    Li, Xiaoxu
    Chen, Guang
    Chien, Jen-Tzung
    Xue, Jing-Hao
    Guo, Jun
    NEUROCOMPUTING, 2019, 328 : 182 - 188
  • [35] Graph Interpretation of Image-Text Matching: Link Prediction on Concept-Enhanced Cross-Modal Graph
    Fan, Zhihao
    Li, Zejun
    Wang, Siyuan
    Wei, Zhongyu
    Shan, Haijun
    NATURAL LANGUAGE PROCESSING AND CHINESE COMPUTING, PT III, NLPCC 2024, 2025, 15361 : 446 - 457
  • [36] Learning Relationship-Enhanced Semantic Graph for Fine-Grained Image-Text Matching
    Liu, Xin
    He, Yi
    Cheung, Yiu-Ming
    Xu, Xing
    Wang, Nannan
    IEEE TRANSACTIONS ON CYBERNETICS, 2024, 54 (02) : 948 - 961
  • [37] Scene Graph Driven Text-Prompt Generation for Image Inpainting
    Shukla, Tripti
    Maheshwari, Paridhi
    Singh, Rajhans
    Shukla, Ankita
    Kulkarni, Kuldeep
    Turaga, Pavan
    2023 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION WORKSHOPS, CVPRW, 2023, : 759 - 768
  • [38] Scene Video Text Tracking With Graph Matching
    Pei, Wei-Yi
    Yang, Chun
    Meng, Li-Yu
    Hou, Jie-Bo
    Tian, Shu
    Yin, Xu-Cheng
    IEEE ACCESS, 2018, 6 : 19419 - 19426
  • [39] Hashing based Efficient Inference for Image-Text Matching
    Tu, Rong-Cheng
    Ji, Lei
    Luo, Huaishao
    Shi, Botian
    Huang, Heyan
    Duan, Nan
    Mao, Xian-Ling
    FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, ACL-IJCNLP 2021, 2021, : 743 - 752
  • [40] Towards Deconfounded Image-Text Matching with Causal Inference
    Li, Wenhui
    Su, Xinqi
    Song, Dan
    Wang, Lanjun
    Zhang, Kun
    Liu, An-An
    PROCEEDINGS OF THE 31ST ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, MM 2023, 2023, : 6264 - 6273