CGNN: Caption-assisted graph neural network for image-text retrieval

Cited by: 3

Authors:

Hu, Yongli [1]

Zhang, Hanfu [1]

Jiang, Huajie [1,2]

Bi, Yandong [1]

Yin, Baocai [1]

Affiliations:

[1] Beijing Univ Technol, Beijing Inst Artificial Intelligence, Fac Informat Technol, Beijing Key Lab Multimedia & Intelligent Software, Beijing 100124, Peoples R China

[2] Beijing Univ Technol, Beijing 100124, Peoples R China

Source:

PATTERN RECOGNITION LETTERS | 2022 / Vol. 161

Keywords:

Image-text retrieval; Cross-modal retrieval; Image captioning; Graph convolution

DOI:

10.1016/j.patrec.2022.08.002

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification code:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Image-text retrieval has drawn much attention in recent years, where similarity measure between image and text plays an important role. Most existing works focus on learning global coarse-grained or local fine-grained features for similarity computation. However, the large domain gap between different modalities is often neglected, which makes it difficult to match the images and texts effectively. In order to deal with this problem, we propose to use auxiliary information to release the domain gap, where the image captions are generated. Then, a Caption-Assisted Graph Neural Network (CGNN) is designed to learn the structured relationships among images, captions, and texts. Since the captions and the texts are from the same domain, the domain gap between images and texts can be effectively released. With the help of caption information, our model achieves excellent performance on two cross-modal retrieval datasets, Flickr30K and MS-COCO, which shows the effectiveness of our framework. (c) 2022 Elsevier B.V. All rights reserved.
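
The core idea in the abstract, that a generated caption lives in the same domain as the query text and can therefore pull an image representation closer to matching texts, can be illustrated with a toy sketch. This is not the authors' architecture: the embeddings are made-up numbers, the helper names `cosine` and `graph_conv_step` are hypothetical, and the graph step is a plain mean aggregation with self-loops, with the learned weights and nonlinearities of a real graph convolution omitted.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def graph_conv_step(features, adjacency):
    """One mean-aggregation step over a graph, with self-loops.

    features:  list of node feature vectors (all the same length)
    adjacency: adjacency[i][j] == 1 if node j is a neighbour of node i
    Assumption: no learned weight matrix and no nonlinearity.
    """
    n = len(features)
    dim = len(features[0])
    updated = []
    for i in range(n):
        # Each node averages over itself and its neighbours.
        neighbours = [j for j in range(n) if j == i or adjacency[i][j]]
        updated.append([
            sum(features[j][d] for j in neighbours) / len(neighbours)
            for d in range(dim)
        ])
    return updated


# Toy embeddings (illustrative values only, not taken from the paper).
image = [1.0, 0.0, 0.0]     # image feature, far from the text domain
caption = [0.0, 1.0, 0.2]   # caption generated for that image
text = [0.0, 1.0, 0.0]      # candidate query text

# Node order: 0 = image, 1 = caption. The caption is linked to its image.
adjacency = [[0, 1],
             [1, 0]]
fused_image, _ = graph_conv_step([image, caption], adjacency)

before = cosine(image, text)        # image alone versus text
after = cosine(fused_image, text)   # caption-assisted image versus text
print(f"before={before:.3f} after={after:.3f}")
```

In this toy setting the caption-assisted image node scores higher against the text than the raw image feature does, which mirrors the motivation stated in the abstract under the assumptions listed above.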


Pages: 137 - 142

Page count: 6

50 records in total

[1] Image-text interaction graph neural network for image-text sentiment analysis
Wenxiong Liao
Bi Zeng
Jianqi Liu
Pengfei Wei
Jiongkun Fang
Applied Intelligence, 2022, 52 : 11184 - 11198
[2] Image-text interaction graph neural network for image-text sentiment analysis
Liao, Wenxiong
Zeng, Bi
Liu, Jianqi
Wei, Pengfei
Fang, Jiongkun
APPLIED INTELLIGENCE, 2022, 52 (10) : 11184 - 11198
[3] Scene Graph based Fusion Network for Image-Text Retrieval
Wang, Guoliang
Shang, Yanlei
Chen, Yong
Zhen, Chaoqi
Cheng, Dequan
2023 IEEE INTERNATIONAL CONFERENCE ON MULTIMEDIA AND EXPO, ICME, 2023, : 138 - 143
[4] HGAN: Hierarchical Graph Alignment Network for Image-Text Retrieval
Guo, Jie
Wang, Meiting
Zhou, Yan
Song, Bin
Chi, Yuhao
Fan, Wei
Chang, Jianglong
IEEE TRANSACTIONS ON MULTIMEDIA, 2023, 25 : 9189 - 9202
[5] Cross Attention Graph Matching Network for Image-Text Retrieval
Yang, Xiaoyu
Xie, Hao
Mao, Junyi
Wang, Zhiguo
Yin, Guangqiang
PROCEEDINGS OF THE 13TH INTERNATIONAL CONFERENCE ON COMPUTER ENGINEERING AND NETWORKS, VOL II, CENET 2023, 2024, 1126 : 274 - 286
[6] Cross-modal Graph Matching Network for Image-text Retrieval
Cheng, Yuhao
Zhu, Xiaoguang
Qian, Jiuchao
Wen, Fei
Liu, Peilin
ACM TRANSACTIONS ON MULTIMEDIA COMPUTING COMMUNICATIONS AND APPLICATIONS, 2022, 18 (04)
[7] Flexible graph-based attention and pooling network for image-text retrieval
Sun, Hao
Qin, Xiaolin
Liu, Xiaojing
MULTIMEDIA TOOLS AND APPLICATIONS, 2023, 83 (19) : 57895 - 57912
[8] Heterogeneous Graph Fusion Network for cross-modal image-text retrieval
Qin, Xueyang
Li, Lishuang
Pang, Guangyao
Hao, Fei
EXPERT SYSTEMS WITH APPLICATIONS, 2024, 249
[9] Multimodal Knowledge Graph-guided Cross-Modal Graph Network for Image-text Retrieval
Zheng, Juncheng
Liang, Meiyu
Yu, Yang
Du, Junping
Xue, Zhe
2024 IEEE INTERNATIONAL CONFERENCE ON BIG DATA AND SMART COMPUTING, IEEE BIGCOMP 2024, 2024, : 97 - 100
[10] RELATION-GUIDED NETWORK FOR IMAGE-TEXT RETRIEVAL
Yang, Yulou
Shen, Hao
Yang, Ming
2022 IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING, ICIP, 2022, : 1856 - 1860
