Multi-view 3D object retrieval leveraging the aggregation of view and instance attentive features

Cited by: 12
Authors
Lin, Dongyun [1]
Li, Yiqun [1]
Cheng, Yi [1]
Prasad, Shitala [1]
Nwe, Tin Lay [1]
Dong, Sheng [1]
Guo, Aiyuan [1]
Affiliations
[1] A*STAR, Inst Infocomm Res, 1 Fusionopolis Way, 21-01 Connexis South Tower, Singapore 138632, Singapore
Keywords
View-based 3D object retrieval; View attention module; Instance attention module; ArcFace loss; Cosine distance triplet-center loss; Convolutional neural network
DOI
10.1016/j.knosys.2022.108754
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
In multi-view 3D object retrieval tasks, it is pivotal to aggregate visual features extracted from multiple view images to generate a discriminative representation for a 3D object. The existing multi-view convolutional neural network employs view pooling for feature aggregation, which ignores the local view-relevant discriminative information within each view image and the global correlative information across all view images. To leverage both types of information, we propose two self-attention modules, namely, View Attention Module and Instance Attention Module, to learn view and instance attentive features, respectively. The final representation of a 3D object is the aggregation of three features: original, view-attentive, and instance-attentive. Furthermore, we propose employing the ArcFace loss together with the cosine-distance-based triplet-center loss as the metric learning guidance to train our model. As the cosine distance is used to rank the retrieval results, our angular metric learning losses achieve a consistent objective between the training and testing processes, thereby facilitating discriminative feature learning. Extensive experiments and ablation studies are conducted on four publicly available datasets on 3D object retrieval to show the superiority of the proposed method over multiple state-of-the-art methods. © 2022 Elsevier B.V. All rights reserved.
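The abstract's point about training and testing sharing an angular objective can be illustrated with a small sketch. It is not the authors' implementation: the function names, the toy 3-dimensional descriptors, and the margin and scale defaults are assumptions chosen for illustration, and the attention modules themselves are not modelled. It shows retrieval ranking by cosine distance next to the general ArcFace form of the target-class logit, scale * cos(theta + margin), both of which depend only on the angle between vectors.

```python
import math


def cosine_distance(a, b):
    """Return 1 - cos(angle between a and b) for two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def rank_by_cosine_distance(query, gallery):
    """Return gallery indices sorted from nearest to farthest from the query."""
    return sorted(range(len(gallery)),
                  key=lambda i: cosine_distance(query, gallery[i]))


def arcface_target_logit(cos_theta, margin=0.5, scale=30.0):
    """General ArcFace form for the true class: scale * cos(theta + margin).

    margin and scale are illustrative defaults, not values from the paper.
    """
    theta = math.acos(max(-1.0, min(1.0, cos_theta)))  # clamp for safety
    return scale * math.cos(theta + margin)


# Hypothetical object descriptors (illustrative values only).
query = [1.0, 0.0, 0.0]
gallery = [[0.0, 1.0, 0.0], [2.0, 0.1, 0.0], [-1.0, 0.0, 0.0]]
ranking = rank_by_cosine_distance(query, gallery)
print(ranking)  # -> [1, 0, 2]
```

Vector length does not affect the ranking (the second gallery item is nearest despite its larger norm), which is the property an angular training loss shares with cosine-distance retrieval.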
Pages: 12