Learning visual similarity for image retrieval with global descriptors and capsule networks

Authors
Durmus, Duygu [1]
Gudukbay, Ugur [1]
Ulusoy, Ozgur [1]
Affiliations
[1] Bilkent Univ, Dept Comp Engn, TR-06800 Ankara, Turkiye
Keywords
Deep learning; Neural networks; Capsule networks; Global descriptors; Image retrieval; Triplet loss; Cost-sensitive regularized cross-entropy loss
DOI
10.1007/s11042-023-16164-5
Chinese Library Classification
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Finding matching images across large and unstructured datasets is vital in many computer vision applications. With the emergence of deep learning-based solutions, various visual tasks, such as image retrieval, have been successfully addressed. Learning visual similarity is crucial for image matching and retrieval tasks. Capsule Networks enable learning richer information that describes the object without losing the essential spatial relationship between the object and its parts. In addition, global descriptors are widely used for representing images. We propose a framework that combines the power of global descriptors and Capsule Networks, using information from multiple views of images to enhance image retrieval performance. The Spatial Grouping Enhance strategy, which enhances sub-features in parallel, and self-attention layers, which explore global dependencies within internal representations of images, are used to strengthen the image representations. The approach captures resemblances between similar images and differences between non-similar images using triplet loss and cost-sensitive regularized cross-entropy loss. On the Stanford Online Products Database, the results surpass state-of-the-art approaches, with Recall@K of 85.0, 94.4, 97.8, and 99.3 for K = 1, 10, 100, and 1000, respectively.
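The abstract names triplet loss and Recall@K without defining them. The following is a minimal sketch of their common textbook forms, not the authors' implementation: the Euclidean distance, the margin value of 0.2, the function names, and the toy 2-D embeddings are all assumptions made for illustration.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors (assumed metric)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def triplet_loss(anchor, positive, negative, margin=0.2):
    """Hinge-style triplet loss: zero once the negative is farther from the
    anchor than the positive by at least `margin` (margin is an assumption)."""
    return max(0.0, euclidean(anchor, positive) - euclidean(anchor, negative) + margin)


def recall_at_k(embeddings, labels, k):
    """Fraction of queries with at least one same-label item among their
    k nearest neighbours, excluding the query itself."""
    hits = 0
    for i, query in enumerate(embeddings):
        others = [j for j in range(len(embeddings)) if j != i]
        ranked = sorted(others, key=lambda j: euclidean(query, embeddings[j]))
        if any(labels[j] == labels[i] for j in ranked[:k]):
            hits += 1
    return hits / len(embeddings)


# Toy data (hypothetical): two classes, with the last point embedded
# closer to the wrong class than to its own.
embeddings = [(0.0, 0.0), (0.1, 0.0), (1.0, 1.0), (1.1, 1.0), (0.7, 0.6)]
labels = [0, 0, 1, 1, 0]

easy = triplet_loss((0.0, 0.0), (0.1, 0.0), (1.0, 1.0))  # well separated
hard = triplet_loss((0.7, 0.6), (0.0, 0.0), (1.0, 1.0))  # violates the margin
r1 = recall_at_k(embeddings, labels, 1)
r3 = recall_at_k(embeddings, labels, 3)
```

In this toy set, the misplaced point is what keeps Recall@1 below 1.0, and it is also the anchor for which the triplet loss is nonzero, which is the kind of case the loss is meant to penalize during training.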
Pages: 20243-20263
Page count: 21