PARTCLIP: HOW DOES CLIP ASSIST MECHANICAL PART IMAGE RETRIEVAL?

Cited by: 0
Authors
Mao, Shangbo [1 ]
Lin, Dongyun [1 ]
Guo, Aiyuan [1 ]
Li, Yiqun [1 ]
Affiliations
[1] A*STAR, Institute for Infocomm Research (I2R), Singapore, Singapore
Keywords
CLIP; image retrieval; knowledge distillation
DOI
10.1109/ICMEW63481.2024.10645410
Chinese Library Classification
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
CLIP demonstrates impressive performance on several downstream tasks, such as zero-shot image classification. However, these tasks typically involve images of everyday scenes, and the efficacy of CLIP on domain-specific computer vision tasks in the manufacturing industry remains unexplored. This paper first investigates how well CLIP understands mechanical part images from industrial manufacturing scenes by thoroughly evaluating its performance on the mechanical part image retrieval task. The evaluation shows that using CLIP directly is less effective for this task. Moreover, because the task must be deployed on an industrial platform in a factory, the large size of the CLIP model presents a practical challenge. We therefore explore knowledge distillation techniques to transfer the knowledge of CLIP into a lighter EfficientNet-B1. Our experimental results demonstrate that this CLIP-based knowledge distillation approach significantly improves the performance of EfficientNet-B1 on mechanical part image retrieval.
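The abstract does not state which distillation objective or retrieval metric the paper uses, so the following is only a minimal sketch under stated assumptions: a feature-level objective that pulls each student embedding toward the paired teacher (CLIP) embedding via cosine similarity, and retrieval by ranking gallery embeddings on cosine similarity to the query. The function names (`distillation_loss`, `retrieve`) and the assumption that student and teacher embeddings already share a dimension are illustrative and do not come from the paper.

```python
import math


def l2_normalize(v):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else list(v)


def cosine(a, b):
    """Cosine similarity between two vectors of equal length."""
    a, b = l2_normalize(a), l2_normalize(b)
    return sum(x * y for x, y in zip(a, b))


def distillation_loss(student_feats, teacher_feats):
    """Assumed objective: mean (1 - cosine similarity) over paired embeddings.

    student_feats: embeddings from the small model (e.g. an EfficientNet-B1).
    teacher_feats: embeddings from the large model (e.g. a CLIP image encoder).
    Both are assumed to have been projected to the same dimension.
    """
    losses = [1.0 - cosine(s, t) for s, t in zip(student_feats, teacher_feats)]
    return sum(losses) / len(losses)


def retrieve(query, gallery, k=3):
    """Return indices of the k gallery embeddings most similar to the query."""
    scores = [cosine(query, g) for g in gallery]
    ranked = sorted(range(len(gallery)), key=lambda i: scores[i], reverse=True)
    return ranked[:k]
```

In this sketch the loss is zero when every student embedding points in the same direction as its teacher embedding, regardless of magnitude, which is why the same cosine measure can then be reused for ranking at retrieval time.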
Pages: 5