Modeling the Uncertainty for Self-supervised 3D Skeleton Action Representation Learning

Cited by: 19

Authors:

Su, Yukun [1]

Lin, Guosheng [2]

Sun, Ruizhou [1]

Hao, Yun [1]

Wu, Qingyao [1]

Affiliations:

[1] South China Univ Technol, Guangzhou, Peoples R China

[2] Nanyang Technol Univ, Singapore, Singapore

Source:

PROCEEDINGS OF THE 29TH ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, MM 2021 | 2021

Funding:

National Research Foundation of Singapore; National Natural Science Foundation of China

Keywords:

self-supervised; 3D skeleton action; uncertainty; probabilistic embedding; space;

DOI:

10.1145/3474085.3475248

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory]

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Self-supervised learning (SSL) has proven very effective at learning representations from unlabeled data in the language and vision domains. Yet very few instrumental self-supervised approaches exist for 3D skeleton action understanding, and directly applying existing SSL methods from other domains to skeleton action learning may suffer from misalignment of representations and other limitations. In this paper, we consider that a good representation learning encoder can distinguish the underlying features of different actions, pulling similar motions closer while pushing dissimilar motions away. There exist, however, some uncertainties in skeleton actions due to the inherent ambiguity of 3D skeleton pose under different viewpoints or the sampling algorithm in contrastive learning; thus, it is ill-posed to differentiate action features in a deterministic embedding space. To address these issues, we rethink the distance between action features and propose to model each action representation in a probabilistic embedding space to alleviate the uncertainties that arise from ambiguous 3D skeleton inputs. To validate the effectiveness of the proposed method, extensive experiments are conducted on the Kinetics, NTU60, NTU120, and PKUMMD datasets with several alternative network architectures. Experimental evaluations demonstrate the superiority of our approach, which yields significant performance improvements without using extra labeled data.

Pages: 769-778

Page count: 10
