Fine-tuning 3D foundation models for geometric object retrieval

Authors
Van den Herrewegen, Jarne [1, 2]
Tourwe, Tom [1 ]
Ovsjanikov, Maks [3 ]
Wyffels, Francis [2 ]
Affiliations
[1] Oqton AI, Edegem, Belgium
[2] Ghent Univ Imec, AI & Robot Lab, IDLab AIRO, Zwijnaarde, Belgium
[3] Ecole Polytech, LIX, Palaiseau, France
Source
COMPUTERS & GRAPHICS-UK | 2024 / Vol. 122
Keywords
Object retrieval; Deep learning; 3D; Transfer learning; Foundation models; Self-supervised learning; NEURAL-NETWORK;
DOI
10.1016/j.cag.2024.103993
Chinese Library Classification
TP31 [Computer software];
Discipline classification code
081202; 0835;
Abstract
Foundation models such as ULIP-2 (Xue et al., 2023) have recently propelled the field of 3D deep learning forward. These models are trained on significantly more data and show superior representation learning capacity in many downstream tasks, such as 3D shape classification and few-shot part segmentation. A particular characteristic of recent 3D foundation models is that they are typically multi-modal and involve image (2D) as well as caption (text) branches. This leads to an intricate interplay that benefits all modalities. At the same time, the nature of the 3D encoders involved in these foundation models, considered on their own, is not well understood. Specifically, there is little analysis of either the utility of the pre-trained 3D features provided by these models or their capacity to adapt to new downstream 3D data. Furthermore, existing studies typically focus on label-oriented downstream tasks, such as shape classification, and ignore other critical applications, such as 3D content-based object retrieval. In this paper, we fill this gap and show, for the first time, how 3D foundation models can be leveraged for strong 3D-to-3D retrieval performance on seven different datasets, on par with state-of-the-art view-based architectures. We evaluate both the pre-trained foundation models and their versions fine-tuned on downstream data. We compare supervised fine-tuning using classification labels against two self-supervised, label-free fine-tuning methods. Importantly, we introduce and describe a methodology for fine-tuning, as we found this to be crucial for making transfer learning from 3D foundation models work in a stable manner.
Pages: 10