Learning to Embed Semantic Similarity for Joint Image-Text Retrieval

Cited by: 6

Authors:

Malali, Noam [1]

Keller, Yosi [1]

Affiliation:

[1] Bar Ilan Univ, Fac Engn, IL-5290002 Ramat Gan, Israel

Source:

IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE | 2022 / Vol. 44 / No. 12

Keywords:

Text and image fusion; deep learning; joint embedding;

DOI:

10.1109/TPAMI.2021.3132163

Chinese Library Classification:

TP18 [Artificial Intelligence Theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

We present a deep learning approach for learning the joint semantic embeddings of images and captions in a Euclidean space, such that the semantic similarity is approximated by the L2 distances in the embedding space. For that, we introduce a metric learning scheme that utilizes multitask learning to learn the embedding of identical semantic concepts using a center loss. By introducing a differentiable quantization scheme into the end-to-end trainable network, we derive a semantic embedding of semantically similar concepts in Euclidean space. We also propose a novel metric learning formulation using an adaptive margin hinge loss that is refined during the training phase. The proposed scheme was applied to the MS-COCO, Flickr30K and Flickr8K datasets, and was shown to compare favorably with contemporary state-of-the-art approaches.

Pages: 10252 - 10260

Page count: 9

50 items in total

[21] Cross-modal Image-Text Retrieval with Multitask Learning
Luo, Junyu
Shen, Ying
Ao, Xiang
Zhao, Zhou
Yang, Min
PROCEEDINGS OF THE 28TH ACM INTERNATIONAL CONFERENCE ON INFORMATION & KNOWLEDGE MANAGEMENT (CIKM '19), 2019, : 2309 - 2312
[22] Joint feature approach for image-text cross-modal retrieval
Gao, Dihui
Sheng, Lijie
Xu, Xiaodong
Miao, Qiguang
Xi'an Dianzi Keji Daxue Xuebao/Journal of Xidian University, 2024, 51 (04): : 128 - 138
[23] Joint Intra & Inter-Grained Reasoning: A New Look Into Semantic Consistency of Image-Text Retrieval
Pan, Renjie
Yang, Hua
Li, Cunyan
Yang, Jinhai
IEEE TRANSACTIONS ON MULTIMEDIA, 2024, 26 : 4912 - 4925
[24] Context-aware relation enhancement and similarity reasoning for image-text retrieval
Cui, Zheng
Hu, Yongli
Sun, Yanfeng
Yin, Baocai
IET COMPUTER VISION, 2024, 18 (05) : 652 - 665
[25] Multi-Task Visual Semantic Embedding Network for Image-Text Retrieval
Qin, Xue-Yang
Li, Li-Shuang
Tang, Jing-Yao
Hao, Fei
Ge, Mei-Ling
Pang, Guang-Yao
JOURNAL OF COMPUTER SCIENCE AND TECHNOLOGY, 2024, 39 (04) : 811 - 826
[26] SAM: cross-modal semantic alignments module for image-text retrieval
Pilseo Park
Soojin Jang
Yunsung Cho
Youngbin Kim
Multimedia Tools and Applications, 2024, 83 : 12363 - 12377
[27] SAM: cross-modal semantic alignments module for image-text retrieval
Park, Pilseo
Jang, Soojin
Cho, Yunsung
Kim, Youngbin
MULTIMEDIA TOOLS AND APPLICATIONS, 2024, 83 (04) : 12363 - 12377
[28] Multi-view and region reasoning semantic enhancement for image-text retrieval
Cheng, Wengang
Han, Ziyi
He, Di
Wu, Lifang
MULTIMEDIA SYSTEMS, 2024, 30 (04)
[29] Entity Semantic Feature Fusion Network for Remote Sensing Image-Text Retrieval
Shui, Jianan
Ding, Shuaipeng
Li, Mingyong
Ma, Yan
WEB AND BIG DATA, APWEB-WAIM 2024, PT V, 2024, 14965 : 130 - 145
[30] JECL: Joint Embedding and Cluster Learning for Image-Text Pairs
Yang, Sean T.
Huang, Kuan-Hao
Howe, Bill
2020 25TH INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION (ICPR), 2021, : 8344 - 8351