Modeling the Uncertainty for Self-supervised 3D Skeleton Action Representation Learning

Cited by: 19
Authors
Su, Yukun [1]
Lin, Guosheng [2]
Sun, Ruizhou [1]
Hao, Yun [1]
Wu, Qingyao [1]
Affiliations
[1] South China Univ Technol, Guangzhou, Peoples R China
[2] Nanyang Technol Univ, Singapore, Singapore
Funding
National Research Foundation, Singapore; National Natural Science Foundation of China
Keywords
self-supervised; 3D skeleton action; uncertainty; probabilistic embedding space
DOI
10.1145/3474085.3475248
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Self-supervised learning (SSL) has proven very effective at learning representations from unlabeled data in the language and vision domains. Yet very few instrumental self-supervised approaches exist for 3D skeleton action understanding, and directly applying existing SSL methods from other domains to skeleton action learning may suffer from misaligned representations and other limitations. In this paper, we consider that a good representation learning encoder can distinguish the underlying features of different actions, pulling similar motions closer while pushing dissimilar motions apart. However, skeleton actions carry uncertainties, due to the inherent ambiguity of 3D skeleton poses under different viewpoints or to the sampling algorithm in contrastive learning, so differentiating action features in a deterministic embedding space is ill-posed. To address these issues, we rethink the distance between action features and propose to model each action representation in a probabilistic embedding space, alleviating the uncertainties that arise with ambiguous 3D skeleton inputs. To validate the effectiveness of the proposed method, extensive experiments are conducted on the Kinetics, NTU60, NTU120, and PKUMMD datasets with several alternative network architectures. Experimental evaluations demonstrate the superiority of our approach, which yields significant performance improvement without using extra labeled data.
Pages: 769-778
Number of pages: 10
Related papers
50 in total
  • [21] Self-Supervised 3D Behavior Representation Learning Based on Homotopic Hyperbolic Embedding
    Chen, Jinghong
    Jin, Zhihao
    Wang, Qicong
    Meng, Hongying
    IEEE TRANSACTIONS ON IMAGE PROCESSING, 2023, 32 : 6061 - 6074
  • [23] Mutual information guided 3D ResNet for self-supervised video representation learning
    Xue, Fei
    Ji, Hongbing
    Zhang, Wenbo
    IET IMAGE PROCESSING, 2020, 14 (13) : 3066 - 3075
  • [24] Spatio-temporal Self-Supervised Representation Learning for 3D Point Clouds
    Huang, Siyuan
    Xie, Yichen
    Zhu, Song-Chun
    Zhu, Yixin
    2021 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2021), 2021, : 6515 - 6525
  • [25] Trusted 3D self-supervised representation learning with cross-modal settings
    Han, Xu
    Cheng, Haozhe
    Shi, Pengcheng
    Zhu, Jihua
    MACHINE VISION AND APPLICATIONS, 2024, 35 (04)
  • [26] Motion Guided Attention Learning for Self-Supervised 3D Human Action Recognition
    Yang, Yang
    Liu, Guangjun
    Gao, Xuehao
    IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2022, 32 (12) : 8623 - 8634
  • [27] How and What to Learn: Taxonomizing Self-Supervised Learning for 3D Action Recognition
    Ben Tanfous, Amor
    Zerroug, Aimen
    Linsley, Drew
    Serre, Thomas
    2022 IEEE WINTER CONFERENCE ON APPLICATIONS OF COMPUTER VISION (WACV 2022), 2022, : 2888 - 2897
  • [28] Attention-guided mask learning for self-supervised 3D action recognition
    Zhang, Haoyuan
    COMPLEX & INTELLIGENT SYSTEMS, 2024, 10 (06) : 7487 - 7496
  • [29] Skeleton-Contrastive 3D Action Representation Learning
    Thoker, Fida Mohammad
    Doughty, Hazel
    Snoek, Cees G. M.
    PROCEEDINGS OF THE 29TH ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, MM 2021, 2021, : 1655 - 1663
  • [30] Uncertainty-aware Self-supervised 3D Data Association
    Wang, Jianren
    Ancha, Siddharth
    Chen, Yi-Ting
    Held, David
    2020 IEEE/RSJ INTERNATIONAL CONFERENCE ON INTELLIGENT ROBOTS AND SYSTEMS (IROS), 2020, : 8125 - 8132