Deep attentional fine-grained similarity network with adversarial learning for cross-modal retrieval

Cited by: 5
Authors
Cheng, Qingrong [1]
Gu, Xiaodong [1]
Affiliations
[1] Fudan Univ, Dept Elect Engn, Shanghai 200433, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Attention mechanism; Cross-modal retrieval; Bidirectional LSTM; Fine-grained similarity;
DOI
10.1007/s11042-020-09450-z
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
Multimedia devices and multimedia technologies have developed rapidly in recent years. Retrieving interesting and highly relevant information from massive multimedia data has become an urgent and challenging problem. To obtain more accurate retrieval results, researchers naturally turn to more fine-grained features for evaluating the similarity among multimedia samples. In this paper, we propose a Deep Attentional Fine-grained Similarity Network (DAFSN) for cross-modal retrieval, which is optimized in an adversarial learning manner. The DAFSN model consists of two subnetworks: an attentional fine-grained similarity network for aligned representation learning and a modal discriminative network. The former subnetwork adopts a bidirectional Long Short-Term Memory (Bi-LSTM) network and a pre-trained Inception-v3 model to extract text features and image features, respectively. In aligned representation learning, we consider not only the sentence-level pair-matching constraint but also the fine-grained similarity between the word-level features of a text description and the sub-regional features of an image. The modal discriminative network aims to minimize the "heterogeneity gap" between text features and image features in an adversarial manner. We conduct experiments on several widely used datasets to verify the performance of the proposed DAFSN. The experimental results show that DAFSN obtains better retrieval results as measured by the mean average precision (MAP) metric. In addition, result analyses and visual comparisons are presented in the experimental section.
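The abstract reports results with the MAP metric but does not spell out how it is computed. The following is a minimal sketch of the standard definition of mean average precision over ranked retrieval lists, not the authors' evaluation code. The function names and the binary-relevance input format are assumptions made for illustration; the paper's exact protocol (for example, any cutoff on list length) is not stated in this record.

```python
from typing import Sequence


def average_precision(relevance: Sequence[int]) -> float:
    """Average precision for one query.

    relevance[i] is 1 if the item at rank i+1 is relevant to the query
    (assumed here: it shares a label with the query), otherwise 0.
    """
    hits = 0
    precision_sum = 0.0
    for rank, rel in enumerate(relevance, start=1):
        if rel:
            hits += 1
            # Precision measured at each rank where a relevant item appears.
            precision_sum += hits / rank
    # A query with no relevant items contributes 0 by convention.
    return precision_sum / hits if hits else 0.0


def mean_average_precision(rankings: Sequence[Sequence[int]]) -> float:
    """Mean of the per-query average precision values."""
    if not rankings:
        return 0.0
    return sum(average_precision(r) for r in rankings) / len(rankings)


# Hypothetical example: two text queries, each with a ranked list of images.
example_rankings = [[1, 0, 1], [0, 1]]
example_map = mean_average_precision(example_rankings)
```

In a cross-modal setting the same function would be applied in both directions (image queries against text, and text queries against images), with one ranked list per query.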
Pages: 31401-31428
Page count: 28
Source: Multimedia Tools and Applications, 2020, 79 : 31401 - 31428