Supervised Contrastive Learning for 3D Cross-Modal Retrieval

Cited by: 0
Authors
Choo, Yeon-Seung [1 ]
Kim, Boeun [2 ]
Kim, Hyun-Sik [1 ]
Park, Yong-Suk [1 ]
Affiliations
[1] Korea Elect Technol Inst KETI, Contents Convergence Res Ctr, Seoul 03924, South Korea
[2] Korea Elect Technol Inst KETI, Artificial Intelligence Res Ctr, Seongnam 13509, South Korea
Source
APPLIED SCIENCES-BASEL | 2024 / Vol. 14 / Issue 22
Keywords
cross-modal; object retrieval; contrastive learning;
DOI
10.3390/app142210322
Chinese Library Classification
O6 [Chemistry];
Discipline classification code
0703 ;
Abstract
Interoperability between different virtual platforms requires the ability to search and transfer digital assets across platforms. Digital assets in virtual platforms are represented in different forms or modalities, such as images, meshes, and point clouds. The cross-modal retrieval of three-dimensional (3D) object representations is challenging because the diversity of data representations makes it difficult to find a common feature space. Recent studies have focused on obtaining feature consistency within the same classes and modalities using cross-modal center loss. However, center features are sensitive to hyperparameter variations, making cross-modal center loss susceptible to performance degradation. This paper proposes a new 3D cross-modal retrieval method that uses cross-modal supervised contrastive learning (CSupCon) and the fixed projection head (FPH) strategy. Contrastive learning mitigates the influence of hyperparameters by maximizing feature distinctiveness. The FPH strategy prevents gradient updates in the projection network, enabling the focused training of the backbone networks. Compared to state-of-the-art (SOTA) methods, the proposed method increases mean average precision (mAP) by 1.17 and 0.14 in 3D cross-modal object retrieval experiments on the ModelNet10 and ModelNet40 datasets, respectively.
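The abstract does not give the CSupCon formulation, so the following is only a minimal sketch of the standard supervised contrastive loss applied to a batch that mixes embeddings from several modalities. The function name `sup_con_loss`, the temperature value, and the toy embeddings are assumptions for illustration, not the authors' implementation. The fixed projection head is not modelled here; it would correspond to a projection layer whose weights receive no gradient updates.

```python
import math

def sup_con_loss(features, labels, temperature=0.1):
    """Standard supervised contrastive loss (illustrative, not the paper's code).

    features: list of equal-length float lists, one embedding per sample;
              samples may come from any modality (image, mesh, point cloud).
    labels:   list of class ids, one per embedding.
    """
    # L2-normalise every embedding so dot products are cosine similarities.
    normed = []
    for f in features:
        norm = math.sqrt(sum(x * x for x in f)) or 1.0
        normed.append([x / norm for x in f])

    n = len(normed)
    total, anchors = 0.0, 0
    for i in range(n):
        # Positives: every other sample with the same class, whatever its modality.
        positives = [j for j in range(n) if j != i and labels[j] == labels[i]]
        if not positives:
            continue
        # Temperature-scaled similarities to all other samples in the batch.
        sims = {
            j: sum(a * b for a, b in zip(normed[i], normed[j])) / temperature
            for j in range(n) if j != i
        }
        # Numerically stable log-sum-exp for the softmax denominator.
        max_sim = max(sims.values())
        log_denom = max_sim + math.log(sum(math.exp(s - max_sim) for s in sims.values()))
        # Average negative log-probability over this anchor's positives.
        total += -sum(sims[j] - log_denom for j in positives) / len(positives)
        anchors += 1
    return total / anchors if anchors else 0.0

# Hypothetical toy batch: three modalities for each of two classes.
labels = [0, 0, 0, 1, 1, 1]
aligned = [[1.0, 0.0], [0.9, 0.1], [1.0, 0.1], [0.0, 1.0], [0.1, 0.9], [0.1, 1.0]]
mixed = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.1], [0.0, 1.0], [0.9, 0.1], [0.1, 1.0]]
loss_aligned = sup_con_loss(aligned, labels)
loss_mixed = sup_con_loss(mixed, labels)
```

In this toy batch, the loss is lower when same-class embeddings from different modalities sit close together (`aligned`) than when one embedding per class lands near the other class (`mixed`), which is the behaviour a cross-modal retrieval objective relies on.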
Pages: 13
Related papers
50 in total
  • [21] ICCL: SELF-SUPERVISED INTRA- AND CROSS-MODAL CONTRASTIVE LEARNING WITH 2D-3D PAIRS FOR 3D SCENE UNDERSTANDING
    Higa, Kyota
    Yamaguchi, Masahiro
    Hosoi, Toshinori
    2023 IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING, ICIP, 2023, : 1085 - 1089
  • [22] A semi-supervised cross-modal memory bank for cross-modal retrieval
    Huang, Yingying
    Hu, Bingliang
    Zhang, Yipeng
    Gao, Chi
    Wang, Quan
    NEUROCOMPUTING, 2024, 579
  • [23] Sketch-Based 3D Shape Retrieval Via Cross-Modal Contrastive Learning and Difficulty-Aware Uncertainty Regularization
    Hou, Wentao
    Diao, Zhenyu
    Peng, Jingliang
    PATTERN RECOGNITION AND COMPUTER VISION, PRCV 2024, PT VI, 2025, 15036 : 521 - 534
  • [24] SCLAV: Supervised Cross-modal Contrastive Learning for Audio-Visual Coding
    Sun, Chao
    Chen, Min
    Cheng, Jialiang
    Liang, Han
    Zhu, Chuanbo
    Chen, Jincai
    PROCEEDINGS OF THE 31ST ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, MM 2023, 2023, : 261 - 270
  • [25] Lip and speech synchronization using supervised contrastive learning and cross-modal attention
    Varshney, Munender
    Mukherji, Mayurakshi
    Senthil, Raja G.
    Ganesh, Ananth
    Banerjee, Kingshuk
    2024 IEEE 18TH INTERNATIONAL CONFERENCE ON AUTOMATIC FACE AND GESTURE RECOGNITION, FG 2024, 2024,
  • [26] Two-stage deep learning for supervised cross-modal retrieval
    Jie Shao
    Zhicheng Zhao
    Fei Su
    Multimedia Tools and Applications, 2019, 78 : 16615 - 16631
  • [27] Adaptively Unified Semi-supervised Learning for Cross-Modal Retrieval
    Zhang, Liang
    Ma, Bingpeng
    He, Jianfeng
    Li, Guorong
    Huang, Qingming
    Tian, Qi
    PROCEEDINGS OF THE TWENTY-SIXTH INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE, 2017, : 3406 - 3412
  • [28] Two-stage deep learning for supervised cross-modal retrieval
    Shao, Jie
    Zhao, Zhicheng
    Su, Fei
    MULTIMEDIA TOOLS AND APPLICATIONS, 2019, 78 (12) : 16615 - 16631
  • [29] Early-Learning regularized Contrastive Learning for Cross-Modal Retrieval with Noisy Labels
    Xu, Tianyuan
    Liu, Xueliang
    Huang, Zhen
    Guo, Dan
    Hong, Richang
    Wang, Meng
    PROCEEDINGS OF THE 30TH ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, MM 2022, 2022,
  • [30] Category-Level Contrastive Learning for Unsupervised Hashing in Cross-Modal Retrieval
    Xu, Mengying
    Luo, Linyin
    Lai, Hanjiang
    Yin, Jian
    DATA SCIENCE AND ENGINEERING, 2024, 9 (03) : 251 - 263