Multi-view 3D object retrieval leveraging the aggregation of view and instance attentive features

Cited by: 12
Authors
Lin, Dongyun [1]
Li, Yiqun [1]
Cheng, Yi [1]
Prasad, Shitala [1]
Nwe, Tin Lay [1]
Dong, Sheng [1]
Guo, Aiyuan [1]
Affiliations
[1] ASTAR, Inst Infocomm Res, 1 Fusionopolis Way, 21-01 Connexis South Tower, Singapore 138632, Singapore
Keywords
View-based 3D object retrieval; View attention module; Instance attention module; ArcFace loss; Cosine distance triplet-center loss; CONVOLUTIONAL NEURAL-NETWORK
DOI
10.1016/j.knosys.2022.108754
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
In multi-view 3D object retrieval tasks, it is pivotal to aggregate visual features extracted from multiple view images to generate a discriminative representation for a 3D object. The existing multi-view convolutional neural network employs view pooling for feature aggregation, which ignores the local view-relevant discriminative information within each view image and the global correlative information across all view images. To leverage both types of information, we propose two self-attention modules, namely, View Attention Module and Instance Attention Module, to learn view and instance attentive features, respectively. The final representation of a 3D object is the aggregation of three features: original, view-attentive, and instance-attentive. Furthermore, we propose employing the ArcFace loss together with the cosine-distance-based triplet-center loss as the metric learning guidance to train our model. As the cosine distance is used to rank the retrieval results, our angular metric learning losses achieve a consistent objective between the training and testing processes, thereby facilitating discriminative feature learning. Extensive experiments and ablation studies are conducted on four publicly available datasets on 3D object retrieval to show the superiority of the proposed method over multiple state-of-the-art methods. © 2022 Elsevier B.V. All rights reserved.
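The abstract's point about matching the training objective to the test-time ranking can be illustrated with a small sketch. The code below is not the authors' implementation: it is a minimal pure-Python illustration, assuming plain list embeddings, of ranking a gallery by cosine distance and of the standard single-sample ArcFace loss (an additive angular margin on the true class). The function names, the `scale` and `margin` defaults, and the toy vectors are all hypothetical choices made for this example, and the sketch omits the handling that practical implementations add for angles whose sum with the margin exceeds pi.

```python
import math


def cosine_distance(u, v):
    """Return 1 - cos(angle between u and v)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


def rank_by_cosine_distance(query, gallery):
    """Return gallery indices ordered from smallest to largest cosine distance."""
    distances = [cosine_distance(query, g) for g in gallery]
    return sorted(range(len(gallery)), key=lambda i: distances[i])


def arcface_loss(embedding, class_weights, label, scale=30.0, margin=0.5):
    """Single-sample ArcFace loss (illustrative defaults for scale and margin)."""
    logits = []
    for j, w in enumerate(class_weights):
        # Cosine similarity between the embedding and the class weight vector,
        # clamped so that acos stays within its domain.
        cos_theta = max(-1.0, min(1.0, 1.0 - cosine_distance(embedding, w)))
        theta = math.acos(cos_theta)
        if j == label:
            theta += margin  # additive angular margin on the true class only
        logits.append(scale * math.cos(theta))
    # Numerically stable softmax cross-entropy over the scaled cosine logits.
    top = max(logits)
    log_sum = top + math.log(sum(math.exp(z - top) for z in logits))
    return log_sum - logits[label]


# Toy usage: the gallery item closest in angle to the query is ranked first.
order = rank_by_cosine_distance([1.0, 0.0], [[0.0, 1.0], [1.0, 0.1], [-1.0, 0.0]])
```

Because both the loss and the ranking depend only on angles between vectors, lowering the loss directly tightens the quantity used at retrieval time, which is the consistency the abstract refers to.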
Pages: 12
Related papers
50 in total
  • [1] Multi-View Attentive Contextualization for Multi-View 3D Object Detection
    Liu, Xianpeng
    Zheng, Ce
    Qian, Ming
    Xue, Nan
    Chen, Chen
    Zhang, Zhebin
    Li, Chen
    Wu, Tianfu
    2024 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2024, : 16688 - 16698
  • [2] A Compact Multi-View Descriptor for 3D Object Retrieval
    Daras, Petros
    Axenopoulos, Apostolos
    CBMI: 2009 INTERNATIONAL WORKSHOP ON CONTENT-BASED MULTIMEDIA INDEXING, 2009, : 115 - 119
  • [3] AMVFNet: Attentive Multi-View Fusion Network for 3D Object Detection
    Huang, Yuxiao
    Huang, Zhicong
    Zhao, Jingwen
    Hu, Haifeng
    Chen, Dihu
    ACM TRANSACTIONS ON MULTIMEDIA COMPUTING COMMUNICATIONS AND APPLICATIONS, 2025, 21 (01)
  • [4] Multi-View 3D Object Retrieval With Deep Embedding Network
    Guo, Haiyun
    Wang, Jinqiao
    Gao, Yue
    Li, Jianqiang
    Lu, Hanqing
    IEEE TRANSACTIONS ON IMAGE PROCESSING, 2016, 25 (12) : 5526 - 5537
  • [5] Multi-view and multivariate gaussian descriptor for 3D object retrieval
    Zan Gao
    Kai-Xin Xue
    Hua Zhang
    Multimedia Tools and Applications, 2019, 78 : 555 - 572
  • [6] Multi-view and multivariate gaussian descriptor for 3D object retrieval
    Gao, Zan
    Xue, Kai-Xin
    Zhang, Hua
    MULTIMEDIA TOOLS AND APPLICATIONS, 2019, 78 (01) : 555 - 572
  • [7] Emphasizing 3D Properties in Recurrent Multi-View Aggregation for 3D Shape Retrieval
    Xu, Cheng
    Leng, Biao
    Zhang, Cheng
    Zhou, Xiaochen
    THIRTY-SECOND AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE / THIRTIETH INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE CONFERENCE / EIGHTH AAAI SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2018, : 7428 - 7435
  • [8] 3D Object Retrieval Based on Multi-View Latent Variable Model
    Liu, An-An
    Nie, Wei-Zhi
    Su, Yu-Ting
    IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2019, 29 (03) : 868 - 880
  • [9] Multi-View Hierarchical Fusion Network for 3D Object Retrieval and Classification
    Liu, An-An
    Hu, Nian
    Song, Dan
    Guo, Fu-Bin
    Zhou, He-Yu
    Hao, Tong
    IEEE ACCESS, 2019, 7 : 153021 - 153030
  • [10] 3D object retrieval based on multi-view convolutional neural networks
    Xi-Xi Li
    Qun Cao
    Sha Wei
    Multimedia Tools and Applications, 2017, 76 : 20111 - 20124