Learning to Embed Semantic Similarity for Joint Image-Text Retrieval

Cited by: 6

Authors:

Malali, Noam [1]

Keller, Yosi [1]

Affiliation:

[1] Bar Ilan Univ, Fac Engn, IL-5290002 Ramat Gan, Israel

Source:

IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE | 2022 / Vol. 44 / No. 12

Keywords:

Text and image fusion; deep learning; joint embedding

DOI:

10.1109/TPAMI.2021.3132163

Chinese Library Classification:

TP18 [Artificial intelligence theory]

Subject classification codes:

081104; 0812; 0835; 1405

Abstract:

We present a deep learning approach for learning joint semantic embeddings of images and captions in a Euclidean space, such that semantic similarity is approximated by the L2 distances in the embedding space. To that end, we introduce a metric learning scheme that uses multitask learning to learn the embedding of identical semantic concepts using a center loss. By introducing a differentiable quantization scheme into the end-to-end trainable network, we derive a semantic embedding of semantically similar concepts in Euclidean space. We also propose a novel metric learning formulation using an adaptive margin hinge loss that is refined during the training phase. The proposed scheme was applied to the MS-COCO, Flickr30K and Flickr8K datasets, and was shown to compare favorably with contemporary state-of-the-art approaches.
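
As a rough illustration of the general idea named in the abstract (a hinge loss over L2 distances in a shared embedding space), the sketch below shows a generic margin-based triplet loss in plain Python. It is not the authors' formulation: the margin here is a fixed constant, whereas the abstract describes an adaptive margin refined during training, and the center loss and quantization scheme are not modeled. The function names and toy vectors are hypothetical.

```python
import math


def l2_distance(u, v):
    """Euclidean (L2) distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def hinge_triplet_loss(image, caption_pos, caption_neg, margin=0.2):
    """Generic hinge loss with a FIXED margin (an assumption for this sketch).

    The loss is zero once the matching caption is closer to the image
    than the non-matching caption by at least `margin`.
    """
    d_pos = l2_distance(image, caption_pos)
    d_neg = l2_distance(image, caption_neg)
    return max(0.0, margin + d_pos - d_neg)


# Toy 2-D embeddings (illustrative values only, not from the paper).
image = [0.0, 0.0]
matching = [3.0, 4.0]      # L2 distance 5.0 from the image
non_matching = [6.0, 8.0]  # L2 distance 10.0 from the image

loss_ok = hinge_triplet_loss(image, matching, non_matching, margin=1.0)
loss_bad = hinge_triplet_loss(image, non_matching, matching, margin=1.0)
```

In this toy case the correctly ordered triplet incurs no loss, while swapping the matching and non-matching captions yields a positive penalty.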

Pages: 10252-10260

Page count: 9
