Scene graph fusion and negative sample generation strategy for image-text matching

Cited by: 0
Authors
Wang, Liqin [1, 2, 3]
Yang, Pengcheng [1]
Wang, Xu [1, 2, 3]
Xu, Zhihong [1, 2, 3]
Dong, Yongfeng [1, 2, 3]
Affiliations
[1] Hebei Univ Technol, Sch Artificial Intelligence & Data Sci, Tianjin 300401, Peoples R China
[2] Hebei Prov Key Lab Big Data Calculat, Tianjin 300401, Peoples R China
[3] Hebei Data Driven Ind Intelligent Engn Res Ctr, Tianjin 300401, Peoples R China
Source
JOURNAL OF SUPERCOMPUTING | 2025, Vol. 81, No. 01
Keywords
Image-text matching; Scene graph fusion; Explicit modeling; Negative sample
DOI
10.1007/s11227-024-06652-2
Chinese Library Classification number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
In image-text matching, scene graph-based approaches are commonly used to detect semantic associations between entities across modalities, improving cross-modal interaction by capturing finer-grained associations. However, the associations between images and texts are often modeled only implicitly, which leaves a semantic gap between image and text information. To address the lack of cross-modal information integration and to explicitly model fine-grained semantic information in images and texts, we propose a scene graph fusion and negative sample generation strategy for image-text matching (SGFNS). Furthermore, to strengthen the representation of the less prominent features of similar images in image-text matching, we propose a negative sample generation strategy and introduce an additional loss function that incorporates the negative samples into the training process. In experiments, we verify the effectiveness of our model against current state-of-the-art models that use scene graphs directly.
Pages: 22
Related papers
50 in total
  • [21] Team HUGE: Image-Text Matching via Hierarchical and Unified Graph Enhancing
    Li, Bo
    Wu, You
    Li, Zhixin
    PROCEEDINGS OF THE 4TH ANNUAL ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA RETRIEVAL, ICMR 2024, 2024, : 704 - 712
  • [22] Heterogeneous Graph Fusion Network for cross-modal image-text retrieval
    Qin, Xueyang
    Li, Lishuang
    Pang, Guangyao
    Hao, Fei
    EXPERT SYSTEMS WITH APPLICATIONS, 2024, 249
  • [23] Selectively Hard Negative Mining for Alleviating Gradient Vanishing in Image-Text Matching
    Li, Zheng
    Guo, Caili
    Wang, Xin
    Feng, Zerun
    Du, Zhongtian
    IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2025, 35 (02) : 1921 - 1935
  • [24] Similarity Reasoning and Filtration for Image-Text Matching
    Diao, Haiwen
    Zhang, Ying
    Ma, Lin
    Lu, Huchuan
    THIRTY-FIFTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, THIRTY-THIRD CONFERENCE ON INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE AND THE ELEVENTH SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2021, 35 : 1218 - 1226
  • [25] GraDual: Graph-based Dual-modal Representation for Image-Text Matching
    Long, Siqu
    Han, Soyeon Caren
    Wan, Xiaojun
    Poon, Josiah
    2022 IEEE WINTER CONFERENCE ON APPLICATIONS OF COMPUTER VISION (WACV 2022), 2022, : 2463 - 2472
  • [26] Asymmetric Polysemous Reasoning for Image-Text Matching
    Zhang, Hongping
    Yang, Ming
    2023 23RD IEEE INTERNATIONAL CONFERENCE ON DATA MINING WORKSHOPS, ICDMW 2023, 2023, : 1013 - 1022
  • [27] Visual Semantic Reasoning for Image-Text Matching
    Li, Kunpeng
    Zhang, Yulun
    Li, Kai
    Li, Yuanyuan
    Fu, Yun
    2019 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2019), 2019, : 4653 - 4661
  • [28] Hierarchical Knowledge-Based Graph Embedding Model for Image-Text Matching in IoTs
    Zhang, Lizong
    Li, Meng
    Yan, Ke
    Wang, Ruozhou
    Hui, Bei
    IEEE INTERNET OF THINGS JOURNAL, 2022, 9 (12) : 9399 - 9409
  • [29] GADNet: Improving image-text matching via graph-based aggregation and disentanglement
    Pu, Xiao
    Wang, Zhiwen
    Yuan, Lin
    Wu, Yu
    Jing, Liping
    Gao, Xinbo
    PATTERN RECOGNITION, 2025, 157
  • [30] IMAGE-TEXT MATCHING WITH SHARED SEMANTIC CONCEPTS
    Miao Lanxin
    2022 19TH INTERNATIONAL COMPUTER CONFERENCE ON WAVELET ACTIVE MEDIA TECHNOLOGY AND INFORMATION PROCESSING (ICCWAMTIP), 2022