HMTN: Hierarchical Multi-scale Transformer Network for 3D Shape Recognition

Cited by: 3
Authors
Zhao, Yue [1, 2]
Nie, Weizhi [1]
Gao, Zan [3]
Liu, An-an [1, 2]
Affiliations
[1] Tianjin Univ, Sch Elect & Informat Engn, Tianjin, Peoples R China
[2] Hefei Comprehens Natl Sci Ctr, Inst Artificial Intelligence, Hefei, Peoples R China
[3] Shandong Artificial Intelligence Inst, Jinan, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
3D Shape Recognition; Transformer; Hierarchical Network;
DOI
10.1145/3503161.3548140
Chinese Library Classification
TP39 [Computer Applications];
Discipline classification codes
081203; 0835;
Abstract
As an important field of multimedia, 3D shape recognition has attracted much research attention in recent years. Various approaches have been proposed, among which multiview-based methods show promising performance. In general, an effective 3D shape recognition algorithm should take both local and global multiview visual information into consideration, and should explore the inherent properties of the generated 3D descriptors to guarantee the performance of feature alignment in the common space. To tackle these issues, we propose a novel Hierarchical Multi-scale Transformer Network (HMTN) for the 3D shape recognition task. In HMTN, we propose a multi-level regional transformer (MLRT) module for shape descriptor generation. MLRT includes two branches that extract intra-view local characteristics by modeling region-wise dependencies and provide supervision from multiview global information at different granularities. Specifically, MLRT comprehensively considers the relations between different regions and focuses on the discriminative parts, which improves the effectiveness of the learned descriptors. Finally, we adopt a cross-granularity contrastive learning (CCL) mechanism for shape descriptor alignment in the common space. It explores and utilizes the cross-granularity semantic correlation to guide the descriptor extraction process while performing instance alignment based on category information. We evaluate the proposed network on several public benchmarks, and HMTN achieves competitive performance compared with state-of-the-art (SOTA) methods.
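The abstract does not give the form of the CCL loss. As an illustration only, the sketch below shows one common way to align descriptors from two granularities using category labels: a supervised, InfoNCE-style contrastive loss. The function name `cross_granularity_contrastive_loss`, the `temperature` value, and the choice of loss are assumptions made for this example, not the authors' formulation.

```python
import math


def l2_normalize(v):
    """Scale a vector to unit length (returned unchanged if it is all zeros)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else list(v)


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def cross_granularity_contrastive_loss(coarse, fine, labels, temperature=0.1):
    """Hypothetical category-supervised contrastive loss (not from the paper).

    coarse[i] and fine[i] are descriptors of shape i at two granularities,
    and labels[i] is its category. For each coarse anchor, every fine
    descriptor with the same label counts as a positive.
    """
    coarse = [l2_normalize(v) for v in coarse]
    fine = [l2_normalize(v) for v in fine]
    n = len(labels)
    total = 0.0
    for i in range(n):
        # Cosine similarities between anchor i and all fine descriptors.
        logits = [dot(coarse[i], fine[j]) / temperature for j in range(n)]
        # Numerically stable log-sum-exp for the softmax denominator.
        m = max(logits)
        log_denom = m + math.log(sum(math.exp(l - m) for l in logits))
        # Index i is always a positive, so this list is never empty.
        positives = [j for j in range(n) if labels[j] == labels[i]]
        total += -sum(logits[j] - log_denom for j in positives) / len(positives)
    return total / n


if __name__ == "__main__":
    coarse = [[1.0, 0.0], [0.0, 1.0]]
    fine = [[0.9, 0.1], [0.2, 0.8]]
    print(cross_granularity_contrastive_loss(coarse, fine, labels=[0, 1]))
```

Under these assumptions, the loss is small when same-category descriptors from the two granularities point in similar directions and large when they do not.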
Pages: 9
Related papers
50 in total
  • [1] A multi-scale hierarchical 3D shape representation for similar shape retrieval
    Iyer, Natraj
    Jayanti, Subramaniam
    Lou, Kuiyang
    Kalyanaraman, Yagnanarayanan
    Ramani, Karthik
    TOOLS AND METHODS OF COMPETITIVE ENGINEERING Vols 1 and 2, 2004, : 1117 - 1118
  • [2] Multi-scale Transformer 3D Plane Recovery
    Ren, Fei
    Chang, Qingling
    Liu, Xinglin
    Cui, Yan
    FOURTEENTH INTERNATIONAL CONFERENCE ON GRAPHICS AND IMAGE PROCESSING, ICGIP 2022, 2022, 12705
  • [3] Multi-Scale Representation Learning on Hypergraph for 3D Shape Retrieval and Recognition
    Bai, Junjie
    Gong, Biao
    Zhao, Yining
    Lei, Fuqiang
    Yan, Chenggang
    Gao, Yue
    IEEE TRANSACTIONS ON IMAGE PROCESSING, 2021, 30 (30) : 5327 - 5338
  • [4] Multi-scale Knowledge Transfer Vision Transformer for 3D vessel shape segmentation
    Hua, Michael J.
    Wu, Junjie
    Zhong, Zichun
    COMPUTERS & GRAPHICS-UK, 2024, 122
  • [5] Multi-Scale Attention 3D Convolutional Network for Multimodal Gesture Recognition
    Chen, Huizhou
    Li, Yunan
    Fang, Huijuan
    Xin, Wentian
    Lu, Zixiang
    Miao, Qiguang
    SENSORS, 2022, 22 (06)
  • [6] Hierarchical parallel multi-scale graph network for 3d human pose estimation
    Yang, Honghong
    Liu, Hongxi
    Zhang, Yumei
    Wu, Xiaojun
    APPLIED SOFT COMPUTING, 2023, 140
  • [7] TMSDNet: Transformer with multi-scale dense network for single and multi-view 3D reconstruction
    Zhu, Xiaoqiang
    Yao, Xinsheng
    Zhang, Junjie
    Zhu, Mengyao
    You, Lihua
    Yang, Xiaosong
    Zhang, Jianjun
    Zhao, He
    Zeng, Dan
    COMPUTER ANIMATION AND VIRTUAL WORLDS, 2024, 35 (01)
  • [8] Multi-Scale Spatial Transformer Network for LiDAR-Camera 3D Object Detection
    Wang, Zhifan
    Zhang, Xiaohong
    Wang, Shidong
    Xin, Tong
    Zhang, Haofeng
    Lu, Jianfeng
    2021 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2021,
  • [9] 3D mesh transformer: A hierarchical neural network with local shape tokens
    Chen, Yu
    Zhao, Jieyu
    Huang, Lingfeng
    Chen, Hao
    NEUROCOMPUTING, 2022, 514 : 328 - 340
  • [10] Multi-Scale PointPillars 3D Object Detection Network
    Ya, Hang
    Luo, Guiming
    PROCEEDINGS OF THE 2019 IEEE 18TH INTERNATIONAL CONFERENCE ON COGNITIVE INFORMATICS & COGNITIVE COMPUTING (ICCI*CC 2019), 2019, : 174 - 179