Scene graph fusion and negative sample generation strategy for image-text matching

Cited by: 0

Authors:

Wang, Liqin [1,2,3]

Yang, Pengcheng [1]

Wang, Xu [1,2,3]

Xu, Zhihong [1,2,3]

Dong, Yongfeng [1,2,3]

Affiliations:

[1] Hebei Univ Technol, Sch Artificial Intelligence & Data Sci, Tianjin 300401, Peoples R China

[2] Hebei Prov Key Lab Big Data Calculat, Tianjin 300401, Peoples R China

[3] Hebei Data Driven Ind Intelligent Engn Res Ctr, Tianjin 300401, Peoples R China

Source:

JOURNAL OF SUPERCOMPUTING | 2025 / Volume 81 / Issue 01

Keywords:

Image-text matching; Scene graph fusion; Explicit modeling; Negative sample;

DOI:

10.1007/s11227-024-06652-2

Chinese Library Classification (CLC) number:

TP3 [Computing technology, computer technology];

Subject classification code:

0812 ;

Abstract:

In image-text matching, scene graph-based approaches are commonly used to detect semantic associations between entities across modalities, improving cross-modal interaction by capturing finer-grained associations. However, the associations between images and texts are often modeled only implicitly, which leaves a semantic gap between image and text information. To address the lack of cross-modal information integration and to explicitly model fine-grained semantic information in images and texts, we propose a scene graph fusion and negative sample generation strategy for image-text matching (SGFNS). Furthermore, to strengthen the representation of the inconspicuous features of similar images in image-text matching, we propose a negative sample generation strategy and introduce an extra loss function that incorporates the negative samples into training. In experiments, we verify the effectiveness of our model against current state-of-the-art models that use scene graphs directly.

Pages: 22