Point-MPP: Point Cloud Self-Supervised Learning From Masked Position Prediction

Cited by: 0
Authors
Fan, Songlin [1, 2]
Gao, Wei [1, 2]
Li, Ge [1]
Affiliations
[1] Peking Univ, Sch Elect & Comp Engn, Shenzhen 518055, Peoples R China
[2] Peng Cheng Lab, Sch Elect & Comp Engn, Shenzhen 518066, Peoples R China
Keywords
Point cloud compression; Semantics; Transformers; Standards; Feature extraction; Training; Circuit faults; Predictive models; Encoding; Image reconstruction; Masked position prediction; point cloud; pretraining; self-supervised learning (SSL)
DOI
10.1109/TNNLS.2024.3479309
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Masked autoencoding has gained momentum for improving fine-tuning performance in many downstream tasks. However, it tends to focus on low-level reconstruction details, lacking high-level semantics and resulting in weak transfer capability. This article presents a novel jigsaw puzzle solver inspired by the idea that predicting the positions of disordered point cloud patches provides more semantic information, similar to how children learn by solving jigsaw puzzles. Our method adopts the mask-then-predict paradigm, erasing the positions of selected point patches rather than their contents. We first partition input point clouds into irregular patches and randomly erase the positions of some patches. Then, a Transformer-based model is used to learn high-level semantic features and regress the positions of the masked patches. This approach forces the model to focus on learning transfer-robust semantics while paying less attention to low-level details. To tie the predictions within the encoding space, we further introduce a consistency constraint on their latent representations to encourage the encoded features to contain more semantic cues. We demonstrate that a standard Transformer backbone with our pretraining scheme can capture discriminative point cloud semantic information. Furthermore, extensive experiments indicate that our method outperforms the previous best competitor across six popular downstream vision tasks, achieving new state-of-the-art performance.
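The data-preparation side of the mask-then-predict scheme described in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract says only that the input is partitioned into irregular patches, so the farthest-point-sampling plus nearest-neighbor grouping used here is an assumed (though common) choice, and every function name, the patch count, the patch size, and the 0.5 mask ratio are hypothetical. The Transformer backbone and the latent consistency constraint are omitted; the sketch covers only how patch contents stay visible while patch positions are erased and kept as regression targets.

```python
import math
import random


def farthest_point_sampling(points, num_centers, rng):
    """Pick well-spread points to serve as patch centers (assumed grouping)."""
    first = rng.randrange(len(points))
    chosen = [first]
    # Distance from every point to its nearest already-chosen center.
    min_dist = [math.dist(p, points[first]) for p in points]
    while len(chosen) < num_centers:
        nxt = max(range(len(points)), key=min_dist.__getitem__)
        chosen.append(nxt)
        for i, p in enumerate(points):
            min_dist[i] = min(min_dist[i], math.dist(p, points[nxt]))
    return chosen


def make_patches(points, num_patches, patch_size, rng):
    """Group points into patches and express each patch relative to its center."""
    idx = farthest_point_sampling(points, num_patches, rng)
    centers = [points[i] for i in idx]
    patches = []
    for c in centers:
        nearest = sorted(points, key=lambda p: math.dist(p, c))[:patch_size]
        # Center-relative coordinates keep the patch content from
        # leaking its absolute position.
        patches.append([tuple(a - b for a, b in zip(p, c)) for p in nearest])
    return centers, patches


def mask_positions(centers, mask_ratio, rng):
    """Erase the positions (not the contents) of randomly selected patches."""
    num_masked = int(round(mask_ratio * len(centers)))
    masked = set(rng.sample(range(len(centers)), num_masked))
    visible = [None if i in masked else c for i, c in enumerate(centers)]
    targets = {i: centers[i] for i in masked}  # regression targets
    return visible, targets


def position_loss(predictions, targets):
    """Mean squared error between predicted and true masked positions."""
    total = 0.0
    for i, target in targets.items():
        total += sum((a - b) ** 2 for a, b in zip(predictions[i], target))
    return total / max(len(targets), 1)


# Toy usage on a random cloud; all sizes are hypothetical.
rng = random.Random(0)
cloud = [(rng.random(), rng.random(), rng.random()) for _ in range(256)]
centers, patches = make_patches(cloud, num_patches=16, patch_size=8, rng=rng)
visible, targets = mask_positions(centers, mask_ratio=0.5, rng=rng)
```

In a full pipeline, a model would receive `patches` together with `visible` and would be trained to output a position for every index in `targets`, with `position_loss` (or a similar regression loss) as the objective.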
Pages: 13
Related papers
50 in total
  • [31] DHGCN: Dynamic Hop Graph Convolution Network for Self-Supervised Point Cloud Learning
    Jiang, Jincen
    Zhao, Lizhi
    Lu, Xuequan
    Hu, Wei
    Razzak, Imran
    Wang, Meili
    THIRTY-EIGHTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, VOL 38 NO 11, 2024, : 12883 - 12891
  • [32] SSL-Net: Point-Cloud Generation Network With Self-Supervised Learning
    Sun, Ran
    Gao, Yongbin
    Fang, Zhijun
    Wang, Anjie
    Zhong, Cengsi
    IEEE ACCESS, 2019, 7 : 82206 - 82217
  • [33] Self-supervised Adversarial Masking for 3D Point Cloud Representation Learning
    Szachniewicz, Michal
    Kozlowski, Wojciech
    Stypulkowski, Michal
    Zieba, Maciej
    INTELLIGENT INFORMATION AND DATABASE SYSTEMS, PT II, ACIIDS 2024, 2024, 14796 : 156 - 168
  • [34] Self-supervised indoor scene point cloud completion from a single panorama
    Li, Tong
    Zhang, Zhaoxuan
    Wang, Yuxin
    Cui, Yan
    Li, Yuqi
    Zhou, Dongsheng
    Yin, Baocai
    Yang, Xin
    VISUAL COMPUTER, 2025, 41 (03): : 1891 - 1905
  • [35] Spatiotemporal Self-supervised Learning for Point Clouds in the Wild
    Wu, Yanhao
    Zhang, Tong
    Ke, Wei
    Susstrunk, Sabine
    Salzmann, Mathieu
    2023 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, CVPR, 2023, : 5251 - 5260
  • [36] Self-Supervised Learning for Domain Adaptation on Point Clouds
    Achituve, Idan
    Maron, Haggai
    Chechik, Gal
    2021 IEEE WINTER CONFERENCE ON APPLICATIONS OF COMPUTER VISION (WACV 2021), 2021, : 123 - 133
  • [37] Point2Vec for Self-supervised Representation Learning on Point Clouds
    Abou Zeid, Karim
    Schult, Jonas
    Hermans, Alexander
    Leibe, Bastian
    PATTERN RECOGNITION, DAGM GCPR 2023, 2024, 14264 : 131 - 146
  • [38] Masked Autoencoder for Self-Supervised Pre-training on Lidar Point Clouds
    Hess, Georg
    Jaxing, Johan
    Svensson, Elias
    Hagerman, David
    Petersson, Christoffer
    Svensson, Lennart
    2023 IEEE/CVF WINTER CONFERENCE ON APPLICATIONS OF COMPUTER VISION WORKSHOPS (WACVW), 2023, : 350 - 359
  • [39] A Cross Branch Fusion-Based Contrastive Learning Framework for Point Cloud Self-supervised Learning
    Wu, Chengzhi
    Huang, Qianliang
    Jin, Kun
    Pfrommer, Julius
    Beyerer, Juergen
    2024 INTERNATIONAL CONFERENCE IN 3D VISION, 3DV 2024, 2024, : 528 - 538
  • [40] Improved Point Transformation Methods For Self-Supervised Depth Prediction
    Chen, Ziwen
    Guo, Zixuan
    Weinman, Jerod
    2021 18TH CONFERENCE ON ROBOTS AND VISION (CRV 2021), 2021, : 111 - 118