Learning visual similarity for image retrieval with global descriptors and capsule networks

Cited by: 0
Authors
Durmus, Duygu [1]
Gudukbay, Ugur [1]
Ulusoy, Ozgur [1]
Affiliations
[1] Bilkent Univ, Dept Comp Engn, TR-06800 Ankara, Turkiye
Keywords
Deep learning; Neural networks; Capsule networks; Global descriptors; Image retrieval; Triplet loss; Cost-sensitive regularized cross-entropy loss
DOI
10.1007/s11042-023-16164-5
Chinese Library Classification
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Finding matching images across large, unstructured datasets is vital in many computer vision applications. Deep learning-based solutions have successfully addressed various visual tasks, including image retrieval. Learning visual similarity is crucial for image matching and retrieval. Capsule Networks learn richer information describing an object without losing the essential spatial relationships between the object and its parts. Global descriptors, in turn, are widely used to represent images. We propose a framework that combines global descriptors and Capsule Networks and draws on information from multiple views of images to improve image retrieval performance. The Spatial Grouping Enhance strategy, which enhances sub-features in parallel, and self-attention layers, which explore global dependencies within the internal representations of images, are used to strengthen the image representations. The approach captures resemblances between similar images and differences between dissimilar images using triplet loss and cost-sensitive regularized cross-entropy loss. The results surpass state-of-the-art approaches on the Stanford Online Products dataset, with Recall@K of 85.0, 94.4, 97.8, and 99.3 for K = 1, 10, 100, and 1000, respectively.
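For readers unfamiliar with the two quantities named in the abstract, the following is a minimal sketch of a hinge-style triplet loss and of the Recall@K metric. It is not the authors' implementation: the Euclidean distance, the margin value of 0.2, the function names, and the toy embeddings are all assumptions made for illustration, and the cost-sensitive regularized cross-entropy term is not shown.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors (assumed metric)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def triplet_loss(anchor, positive, negative, margin=0.2):
    """Hinge-style triplet loss: max(0, d(a, p) - d(a, n) + margin).

    The margin of 0.2 is an illustrative default, not a value from the paper.
    """
    return max(0.0, euclidean(anchor, positive) - euclidean(anchor, negative) + margin)


def recall_at_k(embeddings, labels, k):
    """Fraction of queries whose k nearest neighbours (query excluded)
    contain at least one item with the same label."""
    hits = 0
    for i, query in enumerate(embeddings):
        # Rank every other item by distance to the query.
        others = [j for j in range(len(embeddings)) if j != i]
        others.sort(key=lambda j: euclidean(query, embeddings[j]))
        if any(labels[j] == labels[i] for j in others[:k]):
            hits += 1
    return hits / len(embeddings)


# Toy data (hypothetical): five 2-D embeddings with two class labels.
embeddings = [[0.0, 0.0], [0.1, 0.0], [1.0, 1.0], [0.2, 0.1], [0.9, 1.0]]
labels = ["a", "a", "b", "b", "b"]

if __name__ == "__main__":
    for k in (1, 2, 3):
        print(k, recall_at_k(embeddings, labels, k))
```

In this toy set, the fourth embedding lies closer to the class "a" items than to its own class, so it only counts as a hit once K is large enough to reach a class "b" neighbour. This is why Recall@K can only stay the same or grow as K increases, as in the figures reported above.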
Pages: 20243 - 20263
Page count: 21
Related papers
50 in total
  • [1] Learning visual similarity for image retrieval with global descriptors and capsule networks
    Duygu Durmuş
    Uğur Güdükbay
    Özgür Ulusoy
    Multimedia Tools and Applications, 2024, 83 : 20243 - 20263
  • [2] Learning non-metric visual similarity for image retrieval
    Garcia, Noa
    Vogiatzis, George
    IMAGE AND VISION COMPUTING, 2019, 82 : 18 - 25
  • [3] Fashion Image Retrieval with Capsule Networks
    Kinli, Furkan
    Ozcan, Baris
    Kirac, Furkan
    2019 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION WORKSHOPS (ICCVW), 2019 : 3109 - 3112
  • [4] A Hybrid Approach for Image Retrieval Using Visual Descriptors
    Jayaswal, Ruchi
    Jha, Jaimala
    2017 IEEE INTERNATIONAL CONFERENCE ON COMPUTING, COMMUNICATION AND AUTOMATION (ICCCA), 2017 : 1125 - 1130
  • [5] Learning similarity for texture image retrieval
    Guo, GD
    Li, SZ
    Chan, KL
    COMPUTER VISION - ECCV 2000, PT I, PROCEEDINGS, 2000, 1842 : 178 - 190
  • [6] Image retrieval based on similarity learning
    El-Naqa, I
    Wernick, MN
    Yang, YY
    Galatsanos, NP
    2000 INTERNATIONAL CONFERENCE ON IMAGE PROCESSING, VOL III, PROCEEDINGS, 2000 : 722 - 725
  • [7] Active learning for image retrieval via visual similarity metrics and semantic features
    Casado-Coscolla, Alvaro
    Sanchez-Belenguer, Carlos
    Wolfart, Erik
    Angorrilla-Bustamante, Carlos
    Sequeira, Vitor
    ENGINEERING APPLICATIONS OF ARTIFICIAL INTELLIGENCE, 2024, 138
  • [8] Graph Fusion Using Global Descriptors for Image Retrieval
    Mardones, Tomas
    Allende, Hector
    Moraga, Claudio
    PROGRESS IN PATTERN RECOGNITION, IMAGE ANALYSIS, COMPUTER VISION, AND APPLICATIONS, CIARP 2015, 2015, 9423 : 290 - 297
  • [9] Evaluation of Global Descriptors for Large Scale Image Retrieval
    Wang, Hai
    Zhang, Shuwu
    IMAGE ANALYSIS AND PROCESSING - ICIAP 2011, PT I, 2011, 6978 : 626 - 635
  • [10] Learning Food Image Similarity for Food Image Retrieval
    Shimoda, Wataru
    Yanai, Keiji
    2017 IEEE THIRD INTERNATIONAL CONFERENCE ON MULTIMEDIA BIG DATA (BIGMM 2017), 2017 : 165 - 168