PARTCLIP: HOW DOES CLIP ASSIST MECHANICAL PART IMAGE RETRIEVAL?

Cited by: 0
Authors
Mao, Shangbo [1 ]
Lin, Dongyun [1 ]
Guo, Aiyuan [1 ]
Li, Yiqun [1 ]
Affiliations
[1] A*STAR, Institute for Infocomm Research (I2R), Singapore, Singapore
Keywords
CLIP; image retrieval; knowledge distillation
DOI
10.1109/ICMEW63481.2024.10645410
Chinese Library Classification (CLC) number
TP18 [Theory of artificial intelligence];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
CLIP demonstrates impressive performance on several downstream tasks, such as zero-shot image classification. However, these tasks typically involve images of everyday scenes, and the efficacy of CLIP on domain-specific computer vision tasks in the manufacturing industry remains unexplored. This paper first investigates how well CLIP understands mechanical part images from industrial manufacturing scenes by thoroughly evaluating its performance on the mechanical part image retrieval task. The evaluation shows that applying CLIP directly is less effective for this task. In addition, because the task must be deployed on an industrial platform in a factory, the large size of the CLIP model presents a practical challenge. We therefore explore knowledge distillation techniques to transfer the knowledge of CLIP into a lighter EfficientNet-B1. Our experimental results demonstrate that this CLIP-based knowledge distillation approach significantly improves the performance of EfficientNet-B1 on mechanical part image retrieval.
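The abstract does not specify the distillation objective, so the following is only a minimal sketch of one common form of embedding-level distillation, not the authors' method. It assumes the student (for example, an EfficientNet-B1 with a projection head) outputs embeddings of the same dimension as the CLIP teacher, penalizes the cosine distance between paired embeddings, and ranks gallery images by cosine similarity at retrieval time. The function names `distillation_loss` and `retrieve` are hypothetical and exist only for illustration; the embeddings are plain Python lists so that the example runs without any deep learning library.

```python
import math


def l2_normalize(v):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in v]


def cosine_similarity(a, b):
    """Cosine similarity between two vectors of equal length."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    a, b = l2_normalize(a), l2_normalize(b)
    return sum(x * y for x, y in zip(a, b))


def distillation_loss(student_embs, teacher_embs):
    """Mean cosine distance (1 - cosine similarity) over paired embeddings.

    Assumption: student and teacher embeddings are already in the same
    space and have the same dimension (e.g. via a projection head).
    """
    if len(student_embs) != len(teacher_embs) or not student_embs:
        raise ValueError("need equally sized, non-empty batches")
    distances = [
        1.0 - cosine_similarity(s, t)
        for s, t in zip(student_embs, teacher_embs)
    ]
    return sum(distances) / len(distances)


def retrieve(query_emb, gallery_embs, top_k=3):
    """Return indices of the top_k gallery embeddings most similar to the query."""
    scored = [
        (cosine_similarity(query_emb, g), idx)
        for idx, g in enumerate(gallery_embs)
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [idx for _, idx in scored[:top_k]]
```

In a real training loop the loss would be computed on tensors and backpropagated through the student only, with the teacher kept frozen; the list-based version above is meant just to make the objective and the retrieval step concrete.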
Pages: 5