Cross-media retrieval via fusing multi-modality and multi-grained data

Cited: 0
Authors
Liu, Z. [1 ,2 ]
Yuan, S. [1 ,2 ]
Pei, X. [1 ,2 ]
Gao, S. [1 ,2 ]
Han, H. [1 ,2 ]
Affiliations
[1] Shandong Univ Finance & Econ, Sch Comp Sci & Technol, Jinan 250014, Shandong, Peoples R China
[2] Shandong Univ Finance & Econ, Shandong Prov Key Lab Digital Media Technol, Jinan 250014, Shandong, Peoples R China
Keywords
Cross-media retrieval; Multi-modality data; Multi-grained data; Multi-margin triplet loss; Margin-set;
DOI
10.24200/sci.2023.59834.6456
Chinese Library Classification
T [Industrial Technology];
Discipline Classification Code
08;
Abstract
Traditional cross-media retrieval methods mainly focus on coarse-grained data that reflect global characteristics while ignoring fine-grained descriptions of local details. Moreover, they cannot accurately describe the correlations between the anchor and irrelevant data. This paper addresses both problems with a dual-framework approach that fuses coarse-grained and fine-grained features and introduces a multi-margin triplet loss: (1) Framework I, a multi-grained data fusion framework based on a Deep Belief Network, and (2) Framework II, a multi-modality data fusion framework based on the multi-margin triplet loss function. In Framework I, coarse-grained and fine-grained features are fused by a joint Restricted Boltzmann Machine and then fed into Framework II. In Framework II, we propose the multi-margin triplet loss, which pushes data belonging to different modalities and semantic categories away from the anchor by different margins. Experimental results show that the proposed method achieves better cross-media retrieval performance than competing methods on several datasets, and ablation experiments verify that both the multi-grained fusion strategy and the multi-margin triplet loss function are effective. (c) 2023 Sharif University of Technology. All rights reserved.
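A minimal sketch may make the multi-margin idea concrete. The Python/PyTorch snippet below is an illustrative formulation, not the paper's implementation: the negative-type keys, the specific margin values, and the use of Euclidean distance are all assumptions; in the paper the margin-set is defined over combinations of modality and semantic category.

import torch
import torch.nn.functional as F

def multi_margin_triplet_loss(anchor, positive, negatives_by_type, margin_set):
    """Triplet loss with a set of margins instead of a single one.

    anchor, positive:   (batch, dim) embeddings of an anchor and a relevant sample.
    negatives_by_type:  dict mapping a negative type (e.g. "same_modality",
                        "cross_modality") to a (batch, dim) tensor of negatives.
    margin_set:         dict mapping the same keys to margin values, so that
                        different kinds of irrelevant data are pushed away from
                        the anchor by different amounts.
    """
    d_pos = F.pairwise_distance(anchor, positive)        # anchor-positive distance
    loss = anchor.new_zeros(())                           # scalar accumulator
    for neg_type, negatives in negatives_by_type.items():
        d_neg = F.pairwise_distance(anchor, negatives)    # anchor-negative distance
        # Hinge term: negatives of this type should lie at least
        # margin_set[neg_type] farther from the anchor than the positive does.
        loss = loss + F.relu(d_pos - d_neg + margin_set[neg_type]).mean()
    return loss

# Illustrative usage with random embeddings; the margin values are hypothetical,
# chosen only so that cross-modality negatives are pushed farther away.
anchor, positive = torch.randn(8, 128), torch.randn(8, 128)
negatives = {"same_modality": torch.randn(8, 128),
             "cross_modality": torch.randn(8, 128)}
margins = {"same_modality": 0.2, "cross_modality": 0.5}
print(multi_margin_triplet_loss(anchor, positive, negatives, margins))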
Pages: 1645-1669
Page count: 25
Related Papers
50 records in total
  • [1] Measuring multi-modality similarities via subspace learning for cross-media retrieval
    Zhang, Hong
    Weng, Jianguang
    ADVANCES IN MULTIMEDIA INFORMATION PROCESSING - PCM 2006, PROCEEDINGS, 2006, 4261: 979+
  • [2] CROSS-MEDIA TOPIC DETECTION: A MULTI-MODALITY FUSION FRAMEWORK
    Zhang, Yanyan
    Li, Guorong
    Chu, Lingyang
    Wang, Shuhui
    Zhang, Weigang
    Huang, Qingming
    2013 IEEE INTERNATIONAL CONFERENCE ON MULTIMEDIA AND EXPO (ICME 2013), 2013
  • [3] Cross-Media Keyphrase Prediction: A Unified Framework with Multi-Modality Multi-Head Attention and Image Wordings
    Wang, Yue
    Li, Jing
    Lyu, Michael R.
    King, Irwin
    PROCEEDINGS OF THE 2020 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING (EMNLP), 2020: 3311-3324
  • [4] Multi-grained Representation Learning for Cross-modal Retrieval
    Zhao, Shengwei
    Xu, Linhai
    Liu, Yuying
    Du, Shaoyi
    PROCEEDINGS OF THE 46TH INTERNATIONAL ACM SIGIR CONFERENCE ON RESEARCH AND DEVELOPMENT IN INFORMATION RETRIEVAL, SIGIR 2023, 2023: 2194-2198
  • [5] Clinical Trial Retrieval via Multi-grained Similarity Learning
    Luo, Junyu
    Qian, Cheng
    Glass, Lucas
    Ma, Fenglong
    PROCEEDINGS OF THE 47TH INTERNATIONAL ACM SIGIR CONFERENCE ON RESEARCH AND DEVELOPMENT IN INFORMATION RETRIEVAL, SIGIR 2024, 2024: 2950-2954
  • [6] Modality-Dependent Cross-Media Retrieval
    Wei, Yunchao
    Zhao, Yao
    Zhu, Zhenfeng
    Wei, Shikui
    Xiao, Yanhui
    Feng, Jiashi
    Yan, Shuicheng
    ACM TRANSACTIONS ON INTELLIGENT SYSTEMS AND TECHNOLOGY, 2016, 7 (04)
  • [7] Image retrieval with a multi-modality ontology
    Wang, Huan
    Liu, Song
    Chia, Liang-Tien
    MULTIMEDIA SYSTEMS, 2008, 13(5-6): 379-390
  • [8] Image retrieval with a multi-modality ontology
    Huan Wang
    Song Liu
    Liang-Tien Chia
    Multimedia Systems, 2008, 13: 379-390
  • [9] CROSS-MODALITY CORRELATION PROPAGATION FOR CROSS-MEDIA RETRIEVAL
    Zhai, Xiaohua
    Peng, Yuxin
    Xiao, Jianguo
    2012 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2012: 2337-2340
  • [10] Multi-grained unsupervised evidence retrieval for question answering
    Hao You
    Neural Computing and Applications, 2023, 35: 21247-21257