Learning Disentangled Representation for Multi-View 3D Object Recognition

Cited by: 23
Authors
Huang, Jingjia [1]
Yan, Wei [1]
Li, Ge [1]
Li, Thomas [2]
Liu, Shan [3]
Affiliations
[1] Peking Univ, Sch Elect & Comp Engn, Shenzhen Grad Sch, Shenzhen 518055, Peoples R China
[2] Peking Univ, AIIT, Hangzhou 100871, Peoples R China
[3] Tencent Media Lab, Palo Alto, CA 94301 USA
Keywords
Three-dimensional displays; Solid modeling; Feature extraction; Task analysis; Computer architecture; Object recognition; Computational modeling; Multi-view 3D object; object recognition; disentangled representation; FEATURES;
DOI
10.1109/TCSVT.2021.3062190
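Chinese Library Classification (CLC) number
TM [Electrical engineering]; TN [Electronics and communication technology];
Discipline classification code
0808; 0809
Abstract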
3D object recognition is an active research topic. In particular, view-based methods, which represent a 3D object by a collection of its rendered 2D views, play an important role in this field. Current view-based approaches tend to aggregate information from multiple views through pooling-based strategies, which makes the models invariant to view permutation at the cost of an inevitable loss of useful features. In this paper, we introduce a new method that learns a more comprehensive descriptor for a 3D object from its views while remaining robust to view permutation. Our method disentangles the information in the set of multi-view images into a global category-related feature and a set of view-permutation-related features. To separate these two parts, we propose an encoder-decoder-based disentangling architecture, which adds almost no extra computation compared with the baseline model. Systematic experiments on the ModelNet40, ModelNet10, and ShapeNetCore55 datasets demonstrate the effectiveness and competitive performance of the method. Code for the paper will be released soon at "https://github.com/hjjpku/multi_view_sort".
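The trade-off the abstract attributes to pooling-based aggregation can be illustrated with a minimal sketch. This is not the authors' architecture or code; it only shows generic element-wise max pooling over per-view features. The function name `max_pool_views` and all feature values are hypothetical, chosen for illustration.

```python
import random


def max_pool_views(view_features):
    """Element-wise max over a list of equal-length per-view feature vectors."""
    return [max(column) for column in zip(*view_features)]


# Hypothetical 4-dimensional features for three rendered views of one object.
views = [
    [0.2, 0.9, 0.1, 0.4],
    [0.7, 0.3, 0.5, 0.4],
    [0.1, 0.6, 0.8, 0.0],
]
pooled = max_pool_views(views)

# Permutation invariance: reordering the views leaves the descriptor unchanged.
shuffled = views[:]
random.Random(0).shuffle(shuffled)
pooled_shuffled = max_pool_views(shuffled)
print(pooled == pooled_shuffled)

# Information loss: a very different set of views pools to the same descriptor,
# so the per-view detail cannot be recovered from the pooled vector.
other_views = [
    [0.7, 0.9, 0.8, 0.4],
    [0.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0],
]
print(max_pool_views(other_views) == pooled)
```

Both comparisons hold for these values: the pooled descriptor ignores view order, but it also cannot distinguish the two view sets, which is the kind of feature loss the abstract says the proposed disentangled representation aims to avoid.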
Pages: 646-659
Page count: 14