Multi-view 3D object retrieval leveraging the aggregation of view and instance attentive features

Citations: 12

Authors:

Lin, Dongyun [1]

Li, Yiqun [1]

Cheng, Yi [1]

Prasad, Shitala [1]

Nwe, Tin Lay [1]

Dong, Sheng [1]

Guo, Aiyuan [1]

Affiliations:

[1] A*STAR, Institute for Infocomm Research, 1 Fusionopolis Way, 21-01 Connexis South Tower, Singapore 138632, Singapore

Source:

Knowledge-Based Systems | 2022 / Vol. 247

Keywords:

View-based 3D object retrieval; View attention module; Instance attention module; ArcFace loss; Cosine distance triplet-center loss; Convolutional neural network

DOI:

10.1016/j.knosys.2022.108754

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory]

Discipline classification codes:

081104; 0812; 0835; 1405

Abstract:

In multi-view 3D object retrieval tasks, it is pivotal to aggregate the visual features extracted from multiple view images into a discriminative representation of a 3D object. The existing multi-view convolutional neural network employs view pooling for feature aggregation, which ignores both the local view-relevant discriminative information within each view image and the global correlative information across all view images. To leverage both types of information, we propose two self-attention modules, the View Attention Module and the Instance Attention Module, which learn view-attentive and instance-attentive features, respectively. The final representation of a 3D object is the aggregation of three features: original, view-attentive, and instance-attentive. Furthermore, we propose employing the ArcFace loss together with a cosine-distance-based triplet-center loss as metric learning guidance to train our model. Because the cosine distance is also used to rank the retrieval results, these angular metric learning losses give the training and testing processes a consistent objective, thereby facilitating discriminative feature learning. Extensive experiments and ablation studies are conducted on four publicly available 3D object retrieval datasets to show the superiority of the proposed method over multiple state-of-the-art methods. © 2022 Elsevier B.V. All rights reserved.
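
The abstract's point about a consistent objective can be illustrated with a small sketch: the same cosine distance serves both to rank gallery objects at test time and inside a triplet-center hinge loss at training time. This is a minimal pure-Python illustration, not the authors' implementation; the function names, the toy descriptors, the class names, and the margin value 0.2 are all assumptions made for the example, and the attention modules and the ArcFace loss are not modeled here.

```python
import math


def cosine_distance(u, v):
    """Return 1 - cos(angle between u and v); 0 means same direction."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


def rank_gallery(query, gallery):
    """Return gallery indices ordered from smallest to largest cosine distance."""
    return sorted(range(len(gallery)),
                  key=lambda i: cosine_distance(query, gallery[i]))


def triplet_center_loss(feature, label, centers, margin=0.2):
    """Hinge loss on cosine distances to class centers.

    Pulls the feature toward its own class center and pushes it away
    from the nearest center of any other class. The margin is a
    hypothetical value chosen for illustration.
    """
    d_pos = cosine_distance(feature, centers[label])
    d_neg = min(cosine_distance(feature, c)
                for k, c in centers.items() if k != label)
    return max(0.0, d_pos + margin - d_neg)


# Toy 2-D descriptors (hypothetical values, not taken from the paper).
gallery = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
query = [2.0, 0.1]
ranking = rank_gallery(query, gallery)

# Hypothetical class centers for two classes.
centers = {"chair": [1.0, 0.0], "table": [0.0, 1.0]}
loss = triplet_center_loss(query, "chair", centers)
```

Because the loss and the ranking share one distance function, a descriptor that sits close to its class center under the training loss is also ranked close to same-class objects at retrieval time.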


Pages: 12

Related documents (50 in total):

[1] Multi-View Attentive Contextualization for Multi-View 3D Object Detection
Liu, Xianpeng
Zheng, Ce
Qian, Ming
Xue, Nan
Chen, Chen
Zhang, Zhebin
Li, Chen
Wu, Tianfu
2024 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2024, : 16688 - 16698
[2] A Compact Multi-View Descriptor for 3D Object Retrieval
Daras, Petros
Axenopoulos, Apostolos
CBMI: 2009 INTERNATIONAL WORKSHOP ON CONTENT-BASED MULTIMEDIA INDEXING, 2009, : 115 - 119
[3] AMVFNet: Attentive Multi-View Fusion Network for 3D Object Detection
Huang, Yuxiao
Huang, Zhicong
Zhao, Jingwen
Hu, Haifeng
Chen, Dihu
ACM TRANSACTIONS ON MULTIMEDIA COMPUTING COMMUNICATIONS AND APPLICATIONS, 2025, 21 (01)
[4] Multi-View 3D Object Retrieval With Deep Embedding Network
Guo, Haiyun
Wang, Jinqiao
Gao, Yue
Li, Jianqiang
Lu, Hanqing
IEEE TRANSACTIONS ON IMAGE PROCESSING, 2016, 25 (12) : 5526 - 5537
[5] Multi-view and multivariate gaussian descriptor for 3D object retrieval
Gao, Zan
Xue, Kai-Xin
Zhang, Hua
Multimedia Tools and Applications, 2019, 78 : 555 - 572
[6] Multi-view and multivariate gaussian descriptor for 3D object retrieval
Gao, Zan
Xue, Kai-Xin
Zhang, Hua
MULTIMEDIA TOOLS AND APPLICATIONS, 2019, 78 (01) : 555 - 572
[7] Emphasizing 3D Properties in Recurrent Multi-View Aggregation for 3D Shape Retrieval
Xu, Cheng
Leng, Biao
Zhang, Cheng
Zhou, Xiaochen
THIRTY-SECOND AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE / THIRTIETH INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE CONFERENCE / EIGHTH AAAI SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2018, : 7428 - 7435
[8] 3D Object Retrieval Based on Multi-View Latent Variable Model
Liu, An-An
Nie, Wei-Zhi
Su, Yu-Ting
IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2019, 29 (03) : 868 - 880
[9] Multi-View Hierarchical Fusion Network for 3D Object Retrieval and Classification
Liu, An-An
Hu, Nian
Song, Dan
Guo, Fu-Bin
Zhou, He-Yu
Hao, Tong
IEEE ACCESS, 2019, 7 : 153021 - 153030
[10] 3D object retrieval based on multi-view convolutional neural networks
Li, Xi-Xi
Cao, Qun
Wei, Sha
Multimedia Tools and Applications, 2017, 76 : 20111 - 20124
