The Style Transformer with Common Knowledge Optimization for Image-Text Retrieval

Cited by: 2

Authors:

Li W. [1]

Ma Z. [2]

Shi J. [3]

Fan X. [1]

Affiliations:

[1] Harbin Institute of Technology, Department of Computer Science and Technology, Harbin

[2] Peng Cheng Laboratory, Shenzhen

[3] Beijing University of Posts and Telecommunications, School of Cyberspace Security, Beijing

Source:

IEEE Signal Processing Letters | 2023 / Vol. 30

Keywords:

Image-text retrieval; transformer

DOI:

10.1109/LSP.2023.3310870

Abstract:

Image-text retrieval, which associates different modalities, has drawn broad attention due to its research value and wide real-world applications. However, most existing methods have not fully considered the high-level semantic relationships ('style embedding') and the common knowledge shared across modalities. To this end, we introduce a novel style transformer network with common knowledge optimization (CKSTN) for image-text retrieval. The main module is the common knowledge adaptor (CKA), which comprises the style embedding extractor (SEE) and the common knowledge optimization (CKO) modules. Specifically, the SEE uses a sequential update strategy to effectively connect the features of its different stages. The CKO module is introduced to dynamically capture the latent concepts of common knowledge from different modalities. In addition, to obtain generalized temporal common knowledge, we propose a sequential update strategy that effectively integrates the features of different layers in SEE with previous common feature units. CKSTN demonstrates its superiority over state-of-the-art methods in image-text retrieval on the MSCOCO and Flickr30K datasets. Moreover, CKSTN is built on a lightweight transformer, which, with its better performance and fewer parameters, is more convenient and practical for real-world applications. © 1994-2012 IEEE.

Pages: 1197-1201

Page count: 4

