PARTCLIP: HOW DOES CLIP ASSIST MECHANICAL PART IMAGE RETRIEVAL?

Cited by: 0
Authors
Mao, Shangbo [1]
Lin, Dongyun [1]
Guo, Aiyuan [1]
Li, Yiqun [1]
Affiliations
[1] A*STAR, Institute for Infocomm Research (I2R), Singapore, Singapore
Keywords
CLIP; image retrieval; knowledge distillation
DOI
10.1109/ICMEW63481.2024.10645410
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
CLIP demonstrates impressive performance across several downstream tasks, such as zero-shot image classification. However, these tasks typically involve images from everyday scenarios, and the efficacy of CLIP in domain-specific computer vision tasks in the manufacturing industry remains unexplored. This paper first investigates how well CLIP understands mechanical part images from industrial manufacturing scenes by thoroughly evaluating its performance on the mechanical part image retrieval task. The evaluation shows that using CLIP directly is less effective for this task. In addition, because the task must be deployed on an industrial platform in a factory, the large size of the CLIP model presents a practical challenge. We therefore explore knowledge distillation techniques to transfer the knowledge of CLIP into a lighter EfficientNet-B1. Our experimental results demonstrate that this CLIP-based knowledge distillation approach significantly enhances the performance of EfficientNet-B1 on mechanical part image retrieval.
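The abstract does not state which distillation objective the authors use. As a minimal sketch of the general idea only, the Python below assumes a feature-level objective that pulls each student embedding toward the teacher (CLIP) embedding of the same image by minimizing one minus their cosine similarity, and then ranks a gallery by cosine similarity for retrieval. The function names, the toy vectors, and the assumption that student and teacher embeddings share a dimension (in practice a projection layer would usually be needed) are all illustrative and not taken from the paper.

```python
import math

def l2_normalize(v):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else list(v)

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    a, b = l2_normalize(a), l2_normalize(b)
    return sum(x * y for x, y in zip(a, b))

def feature_distillation_loss(student_batch, teacher_batch):
    """Mean (1 - cosine similarity) over paired student/teacher embeddings.

    Assumed objective for illustration; the paper's actual loss may differ.
    """
    losses = [1.0 - cosine(s, t) for s, t in zip(student_batch, teacher_batch)]
    return sum(losses) / len(losses)

def retrieve(query, gallery, top_k=3):
    """Return indices of the top_k gallery embeddings most similar to the query."""
    scores = [cosine(query, g) for g in gallery]
    ranked = sorted(range(len(gallery)), key=lambda i: scores[i], reverse=True)
    return ranked[:top_k]

# Toy embeddings standing in for teacher (CLIP) and student outputs.
teacher = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
student_good = [[2.0, 0.0, 0.0], [0.0, 3.0, 0.0]]  # same directions as the teacher
student_bad = [[0.0, 1.0, 0.0], [1.0, 0.0, 0.0]]   # orthogonal to the teacher

loss_good = feature_distillation_loss(student_good, teacher)
loss_bad = feature_distillation_loss(student_bad, teacher)

# Rank a three-item gallery against a query close to the first item.
ranking = retrieve([1.0, 0.1, 0.0], teacher + [[0.0, 0.0, 1.0]], top_k=2)
```

Under this assumed objective, a student whose embeddings point in the same directions as the teacher's incurs a loss near zero regardless of scale, which matches how cosine-based retrieval ignores embedding magnitude.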
Pages: 5