Supervised Contrastive Learning for 3D Cross-Modal Retrieval

Cited by: 0
Authors
Choo, Yeon-Seung [1]
Kim, Boeun [2]
Kim, Hyun-Sik [1]
Park, Yong-Suk [1]
Affiliations
[1] Korea Elect Technol Inst KETI, Contents Convergence Res Ctr, Seoul 03924, South Korea
[2] Korea Elect Technol Inst KETI, Artificial Intelligence Res Ctr, Seongnam 13509, South Korea
Source
APPLIED SCIENCES-BASEL | 2024 / Vol. 14 / Issue 22
Keywords
cross-modal; object retrieval; contrastive learning
DOI
10.3390/app142210322
Chinese Library Classification number
O6 [Chemistry]
Discipline classification code
0703
Abstract
Interoperability between virtual platforms requires the ability to search for and transfer digital assets across platforms. Digital assets in virtual platforms are represented in different forms, or modalities, such as images, meshes, and point clouds. Cross-modal retrieval of three-dimensional (3D) object representations is challenging because the diversity of data representations makes it difficult to find a common feature space. Recent studies have focused on obtaining feature consistency within the same classes and modalities using cross-modal center loss. However, center features are sensitive to hyperparameter variations, which makes cross-modal center loss susceptible to performance degradation. This paper proposes a new 3D cross-modal retrieval method that uses cross-modal supervised contrastive learning (CSupCon) and a fixed projection head (FPH) strategy. Contrastive learning mitigates the influence of hyperparameters by maximizing feature distinctiveness. The FPH strategy prevents gradient updates in the projection network, so that training focuses on the backbone networks. In 3D cross-modal object retrieval experiments, the proposed method improves mean average precision (mAP) over state-of-the-art (SOTA) methods by 1.17 on ModelNet10 and 0.14 on ModelNet40.
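The two ideas named in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the loss below is the generic supervised contrastive formulation applied to embeddings pooled from two modalities, and the names `supcon_loss`, `W_fixed`, the feature sizes, and the temperature of 0.1 are all assumptions made for illustration. The "fixed projection head" is modeled simply as a projection matrix that is created once and never updated; in an autograd framework one would instead disable gradients for those parameters.

```python
import numpy as np

def l2_normalize(x, eps=1e-12):
    """Scale each row to unit length so dot products are cosine similarities."""
    norms = np.linalg.norm(x, axis=1, keepdims=True)
    return x / np.maximum(norms, eps)

def supcon_loss(embeddings, labels, temperature=0.1):
    """Generic supervised contrastive loss over a pooled batch (illustrative).

    embeddings: (N, D) array, rows from all modalities stacked together.
    labels:     (N,) integer class labels; same-label rows are positives.
    """
    z = l2_normalize(np.asarray(embeddings, dtype=np.float64))
    labels = np.asarray(labels)
    n = z.shape[0]
    not_self = ~np.eye(n, dtype=bool)

    logits = z @ z.T / temperature
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    exp_logits = np.exp(logits) * not_self  # drop each anchor from its own denominator
    log_prob = logits - np.log(exp_logits.sum(axis=1, keepdims=True))

    # Positives: same class, any modality, excluding the anchor itself.
    positives = (labels[:, None] == labels[None, :]) & not_self
    pos_count = positives.sum(axis=1)
    has_pos = pos_count > 0
    mean_log_prob = (log_prob * positives).sum(axis=1)[has_pos] / pos_count[has_pos]
    return float(-mean_log_prob.mean())

# Toy data (hypothetical): two modalities, three classes, two samples per class.
rng = np.random.default_rng(0)
labels = np.array([0, 1, 2, 0, 1, 2])
prototypes = rng.standard_normal((3, 16))
image_feats = prototypes[labels] + 0.05 * rng.standard_normal((6, 16))
point_feats = prototypes[labels] + 0.05 * rng.standard_normal((6, 16))

# Fixed projection head (assumption: one shared linear map, never updated).
W_fixed = rng.standard_normal((16, 8))
W_before = W_fixed.copy()

z = np.concatenate([image_feats @ W_fixed, point_feats @ W_fixed])
aligned = supcon_loss(z, np.concatenate([labels, labels]))
misaligned = supcon_loss(z, np.concatenate([labels, (labels + 1) % 3]))
print(aligned, misaligned)
```

In this toy setup the loss is lower when the labels of the two modalities agree with the underlying clusters than when the point-cloud labels are deliberately shifted, which is the behavior a contrastive objective of this kind is meant to reward.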
Pages: 13