Fine-tuning 3D foundation models for geometric object retrieval

Cited by: 0
Authors
Van den Herrewegen, Jarne [1, 2]
Tourwe, Tom [1]
Ovsjanikov, Maks [3]
Wyffels, Francis [2]
Affiliations
[1] Oqton AI, Edegem, Belgium
[2] Ghent Univ Imec, AI & Robot Lab, IDLab AIRO, Zwijnaarde, Belgium
[3] Ecole Polytech, LIX, Palaiseau, France
Source
COMPUTERS & GRAPHICS-UK | 2024 / Vol. 122
Keywords
Object retrieval; Deep learning; 3D; Transfer learning; Foundation models; Self-supervised learning; NEURAL-NETWORK;
DOI
10.1016/j.cag.2024.103993
Chinese Library Classification (CLC) number
TP31 [Computer software];
Discipline classification code
081202 ; 0835 ;
Abstract
Foundation models such as ULIP-2 (Xue et al., 2023) have recently propelled the field of 3D deep learning forward. These models are trained on significantly more data and show superior representation learning capacity in many downstream tasks, such as 3D shape classification and few-shot part segmentation. A particular characteristic of recent 3D foundation models is that they are typically multi-modal and involve image (2D) as well as caption (text) branches. This leads to an intricate interplay that benefits all modalities. At the same time, the nature of the 3D encoders alone, as used in these foundation models, is not well understood. Specifically, there is little analysis of either the utility of the pre-trained 3D features provided by these models or their capacity to adapt to new downstream 3D data. Furthermore, existing studies typically focus on label-oriented downstream tasks, such as shape classification, and ignore other critical applications, such as 3D content-based object retrieval. In this paper, we fill this gap and show, for the first time, how 3D foundation models can be leveraged for strong 3D-to-3D retrieval performance on seven different datasets, on par with state-of-the-art view-based architectures. We evaluate both the pre-trained foundation models and their versions fine-tuned on downstream data. We compare supervised fine-tuning using classification labels against two self-supervised, label-free fine-tuning methods. Importantly, we introduce and describe a methodology for fine-tuning, as we found this to be crucial to make transfer learning from 3D foundation models work in a stable manner.
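To make the notion of 3D-to-3D content-based retrieval in the abstract concrete, the following is a minimal sketch of nearest-neighbour retrieval over shape embeddings. It is not the authors' pipeline: the function names `l2_normalize` and `retrieve` are hypothetical, cosine similarity is an assumed choice of metric, and random vectors stand in for the features a pre-trained or fine-tuned 3D encoder would produce.

```python
import numpy as np


def l2_normalize(x, eps=1e-12):
    """Scale each row to unit length so a dot product equals cosine similarity."""
    norms = np.linalg.norm(x, axis=1, keepdims=True)
    return x / np.maximum(norms, eps)


def retrieve(query_emb, gallery_emb, k=5):
    """Return indices and cosine similarities of the k closest gallery items per query."""
    q = l2_normalize(query_emb)
    g = l2_normalize(gallery_emb)
    sims = q @ g.T                               # shape: (num_queries, num_gallery)
    top = np.argsort(-sims, axis=1)[:, :k]       # highest similarity first
    return top, np.take_along_axis(sims, top, axis=1)


# Assumption: random vectors stand in for embeddings from a 3D encoder.
rng = np.random.default_rng(0)
gallery = rng.normal(size=(100, 64))
# Queries are slightly perturbed copies of the first three gallery items.
queries = gallery[:3] + 0.01 * rng.normal(size=(3, 64))

idx, scores = retrieve(queries, gallery, k=5)
print(idx[:, 0])  # nearest gallery index for each query
```

In a setting like the one the abstract describes, the gallery and query embeddings would come from the 3D encoder before or after fine-tuning, and the ranked lists would be scored with a retrieval metric of the evaluator's choice.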
Pages: 10
Related papers
50 in total
  • [31] Ymir: A Scheduler for Foundation Model Fine-tuning Workloads in Datacenters
    Gao, Wei
    Zhuang, Weiming
    Li, Minghao
    Sun, Peng
    Wen, Yonggang
    Zhang, Tianwei
    PROCEEDINGS OF THE 38TH ACM INTERNATIONAL CONFERENCE ON SUPERCOMPUTING, ACM ICS 2024, 2024, : 259 - 271
  • [32] Fine-tuning growth in gold nanostructures from achiral 2D to chiral 3D geometries
    Tan, Lili
    Chen, Zhi
    Xiao, Chengyu
    Geng, Zhiyong
    Jin, Yinran
    Wei, Chaoyang
    Teng, Fei
    Fu, Wenlong
    Wang, Peng-peng
    NANO RESEARCH, 2024, 17 (07) : 6654 - 6660
  • [33] Fine-Tuning LLaMA for Multi-Stage Text Retrieval
    Ma, Xueguang
    Wang, Liang
    Yang, Nan
    Wei, Furu
    Lin, Jimmy
    PROCEEDINGS OF THE 47TH INTERNATIONAL ACM SIGIR CONFERENCE ON RESEARCH AND DEVELOPMENT IN INFORMATION RETRIEVAL, SIGIR 2024, 2024, : 2421 - 2425
  • [34] Topological RANSAC for instance verification and retrieval without fine-tuning
    An, Guoyuan
    Seon, Juhyung
    An, InKyu
    Huo, Yuchi
    Yoon, Sung-Eui
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 36 (NEURIPS 2023), 2023,
  • [35] Structured Fine-Tuning of Contextual Embeddings for Effective Biomedical Retrieval
    Ueda, Alberto
    Santos, Rodrygo L. T.
    Macdonald, Craig
    Ounis, Iadh
    SIGIR '21 - PROCEEDINGS OF THE 44TH INTERNATIONAL ACM SIGIR CONFERENCE ON RESEARCH AND DEVELOPMENT IN INFORMATION RETRIEVAL, 2021, : 2031 - 2035
  • [36] 3D MESH OBJECT RETRIEVAL BY DISCRETE AND CONTINUOUS HIDDEN MARKOV MODELS
    Mahjoub, Mohamed Ali
    Abbassi, Malek
    INTERNATIONAL JOURNAL OF IMAGE AND GRAPHICS, 2012, 12 (04)
  • [37] 3D Object Recognition by Geometric Hashing
    Eskizara, Omer
    Akagunduz, Erdem
    Ulusoy, Ilkay
    2009 IEEE 17TH SIGNAL PROCESSING AND COMMUNICATIONS APPLICATIONS CONFERENCE, VOLS 1 AND 2, 2009, : 214 - 217
  • [38] A geometric approach to 3D object comparison
    Novotni, M
    Klein, R
    INTERNATIONAL CONFERENCE ON SHAPE MODELING AND APPLICATIONS, PROCEEDING, 2001, : 167 - 175
  • [39] 3D object retrieval by bipartite matching
    Pan, X
    Zhang, Y
    Ye, XZ
    Zhang, SY
    DIGITAL LIBRARIES: INTERNATIONAL COLLABORATION AND CROSS-FERTILIZATION, PROCEEDINGS, 2004, 3334 : 640 - 640
  • [40] Semantic Enabled 3D Object Retrieval
    Zhou, Jiang
    Ma, Xinyu
    MICRO NANO DEVICES, STRUCTURE AND COMPUTING SYSTEMS, 2011, 159 : 128 - 131