Combining Global and Local Similarity for Cross-Media Retrieval

Cited by: 20
Authors
Li, Zhixin [1]
Ling, Feng [1]
Zhang, Canlong [1]
Ma, Huifang [2]
Affiliations
[1] Guangxi Normal Univ, Guangxi Key Lab Multisource Informat Min & Secur, Guilin 541004, Peoples R China
[2] Northwest Normal Univ, Coll Comp Sci & Engn, Lanzhou 730070, Peoples R China
Source
IEEE ACCESS | 2020 / Vol. 8 / Issue 08
Funding
National Natural Science Foundation of China;
Keywords
Convolutional neural network; self-attention network; attention mechanism; two-level network; cross-media retrieval;
DOI
10.1109/ACCESS.2020.2969808
Chinese Library Classification number
TP [Automation Technology, Computer Technology];
Discipline classification code
0812;
Abstract
This paper studies image-text matching, with the aim of matching images and text more accurately. Existing cross-media retrieval methods use only part of the available image and text information: they either match the whole image with the whole sentence, or match some image regions with some words. To better reveal the latent connection between image and text semantics, this paper proposes a cross-media image-text retrieval method that fuses two levels of similarity. It constructs a cross-media two-level network containing two subnetworks, one for global features and one for local features. Specifically, the image is represented as the whole picture and a set of image regions, and the text as the whole sentence and its words; the two levels are learned separately to explore the full latent alignment between images and text. A two-level alignment framework then lets the two levels reinforce each other, and fusing the two kinds of similarity yields a more complete representation for cross-media retrieval. Experiments on the Flickr30K and MS-COCO datasets show that the proposed method makes the semantic matching of images and text more accurate and outperforms popular cross-media retrieval methods on various evaluation metrics.
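The idea of fusing a global (whole image vs. whole sentence) similarity with a local (image regions vs. words) similarity can be sketched as follows. This is only an illustrative sketch, not the paper's network: the abstract does not specify how features are extracted or how the two similarities are fused, so the cosine measure, the max-over-regions alignment, the weighted sum, and all function and parameter names (`cosine`, `global_similarity`, `local_similarity`, `fused_similarity`, `weight`) are assumptions. Feature vectors are taken as given, for example from a CNN and a text encoder.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def global_similarity(image_vec, sentence_vec):
    """Global level: whole-image feature against whole-sentence feature."""
    return cosine(image_vec, sentence_vec)


def local_similarity(region_vecs, word_vecs):
    """Local level (assumed scheme): each word is aligned with its
    best-matching image region, and the scores are averaged over words."""
    if not region_vecs or not word_vecs:
        return 0.0
    best = [max(cosine(w, r) for r in region_vecs) for w in word_vecs]
    return sum(best) / len(best)


def fused_similarity(image_vec, sentence_vec, region_vecs, word_vecs, weight=0.5):
    """Assumed fusion: weighted sum of the global and local similarities."""
    g = global_similarity(image_vec, sentence_vec)
    l = local_similarity(region_vecs, word_vecs)
    return weight * g + (1.0 - weight) * l


if __name__ == "__main__":
    # Toy 2-D features: the global vectors disagree, the local parts align.
    image_vec, sentence_vec = [1.0, 0.0], [0.0, 1.0]
    region_vecs = [[1.0, 0.0], [0.0, 1.0]]
    word_vecs = [[0.0, 1.0], [1.0, 0.0]]
    print(fused_similarity(image_vec, sentence_vec, region_vecs, word_vecs))  # prints 0.5
```

In the toy example the global score alone would rank the pair as unrelated, while the local alignment recovers the match; this is the kind of complementarity the two-level fusion is meant to exploit.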
Pages: 21847 - 21856
Number of pages: 10
Related papers
50 in total
  • [1] Cross-Media Image-Text Retrieval Combined with Global Similarity and Local Similarity
    Li, Zhixin
    Ling, Feng
    Zhang, Canlong
    2019 IEEE INTERNATIONAL CONFERENCE ON DATA SCIENCE AND ADVANCED ANALYTICS (DSAA 2019), 2019 : 145 - 153
  • [2] Cross-Media Similarity Evaluation for Web Image Retrieval in the Wild
    Dong, Jianfeng
    Li, Xirong
    Xu, Duanqing
    IEEE TRANSACTIONS ON MULTIMEDIA, 2018, 20 (09) : 2371 - 2384
  • [3] Effective Heterogeneous Similarity Measure with Nearest Neighbors for Cross-Media Retrieval
    Zhai, Xiaohua
    Peng, Yuxin
    Xiao, Jianguo
    ADVANCES IN MULTIMEDIA MODELING, 2012, 7131 : 312 - 322
  • [4] Cross-Media Image-Text Retrieval with Two Level Similarity
    Li Z.-X.
    Ling F.
    Zhang C.-L.
    Ma H.-F.
    Tien Tzu Hsueh Pao/Acta Electronica Sinica, 2021, 49 (02) : 268 - 274
  • [5] Cross-Media Retrieval Based on Two-Level Similarity and Collaborative Representation
    Zhang, Jiahua
    TRAITEMENT DU SIGNAL, 2023, 40 (05) : 2161 - 2168
  • [6] Relative image similarity learning with contextual information for Internet cross-media retrieval
    Shuqiang Jiang
    Xinhang Song
    Qingming Huang
    Multimedia Systems, 2014, 20 : 645 - 657
  • [7] Relative image similarity learning with contextual information for Internet cross-media retrieval
    Jiang, Shuqiang
    Song, Xinhang
    Huang, Qingming
    MULTIMEDIA SYSTEMS, 2014, 20 (06) : 645 - 657
  • [8] Tri-space and Ranking Based Heterogeneous Similarity Measure for Cross-Media Retrieval
    Ling, Li
    Zhai, Xiaohua
    Peng, Yuxin
    2012 21ST INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION (ICPR 2012), 2012 : 230 - 233
  • [9] Efficient Manifold Ranking for Cross-media retrieval
    Ma, ShaoQin
    Zhang, Hong
    PROCEEDINGS OF THE 2018 13TH IEEE CONFERENCE ON INDUSTRIAL ELECTRONICS AND APPLICATIONS (ICIEA 2018), 2018 : 335 - 340
  • [10] Cross-media Relevance Computation for Multimedia Retrieval
    Dong, Jianfeng
    PROCEEDINGS OF THE 2017 ACM MULTIMEDIA CONFERENCE (MM'17), 2017 : 831 - 835