Self-supervised Pre-training with Masked Shape Prediction for 3D Scene Understanding

Cited by: 2

Authors:

Jiang, Li [1]

Yang, Zetong [2]

Shi, Shaoshuai [1]

Golyanik, Vladislav [1]

Dai, Dengxin [1]

Schiele, Bernt [1]

Affiliations:

[1] Saarland Informatics Campus, Max Planck Institute for Informatics, Saarbrücken, Germany

[2] The Chinese University of Hong Kong (CUHK), Hong Kong, China

Source:

2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) | 2023

DOI:

10.1109/CVPR52729.2023.00119

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory]

Discipline classification codes:

081104; 0812; 0835; 1405

Abstract:

Masked signal modeling has greatly advanced self-supervised pre-training for language and 2D images. However, it is still not fully explored in 3D scene understanding. Thus, this paper introduces Masked Shape Prediction (MSP), a new framework to conduct masked signal modeling in 3D scenes. MSP uses the essential 3D semantic cue, i.e., geometric shape, as the prediction target for masked points. The context-enhanced shape target consisting of explicit shape context and implicit deep shape feature is proposed to facilitate exploiting contextual cues in shape prediction. Meanwhile, the pre-training architecture in MSP is carefully designed to alleviate the masked shape leakage from point coordinates. Experiments on multiple 3D understanding tasks on both indoor and outdoor datasets demonstrate the effectiveness of MSP in learning good feature representations to consistently boost downstream performance.

Pages: 1168-1178

Page count: 11
