OTM-HC: Enhanced Skeleton-Based Action Representation via One-to-Many Hierarchical Contrastive Learning

Cited by: 0
Authors
Usman, Muhammad [1 ,2 ,3 ]
Cao, Wenming [1 ,2 ,3 ]
Huang, Zhao [4 ]
Zhong, Jianqi [1 ,2 ,3 ]
Ji, Ruiya [5 ]
Affiliations
[1] Shenzhen Univ, Coll Elect & Informat Engn, Shenzhen 518060, Peoples R China
[2] Guangdong Key Lab Intelligent Informat Proc, Shenzhen 518060, Peoples R China
[3] Shenzhen Univ, Shenzhen 518060, Peoples R China
[4] Northumbria Univ, Dept Comp & Informat Sci, Newcastle NE1 8ST, England
[5] Queen Mary Univ London, Dept Comp Sci, London E1 4NS, England
Funding
National Natural Science Foundation of China
Keywords
skeleton-based action representation learning; unsupervised learning; hierarchical contrastive learning; one-to-many; graph convolutional networks; LSTM
DOI
10.3390/ai5040106
Chinese Library Classification
TP18 [Artificial Intelligence Theory]
Discipline Classification Codes
081104; 0812; 0835; 1405
Abstract
Human action recognition has become crucial in computer vision, with growing applications in surveillance, human-computer interaction, and healthcare. Traditional approaches often rely on broad feature representations, which can miss subtle variations in timing and movement within action sequences. Our proposed One-to-Many Hierarchical Contrastive Learning (OTM-HC) framework maps the input into multi-layered feature vectors, creating a hierarchical contrastive representation that captures multiple granularities across the temporal and spatial domains of a human skeleton sequence. Using sequence-to-sequence (Seq2Seq) transformer encoders and downsampling modules, OTM-HC distinguishes between multiple levels of action representation, such as the instance, domain, clip, and part levels, each of which contributes to a comprehensive understanding of the action. The OTM-HC design is adaptable, ensuring smooth integration with advanced Seq2Seq encoders. We evaluated OTM-HC on four datasets, demonstrating improved performance over state-of-the-art models: it achieved improvements of 0.9% and 0.6% on NTU60, 0.4% and 0.7% on NTU120, and 0.7% and 0.3% on PKU-MMD I and II, respectively, surpassing previous leading approaches on these datasets. These results showcase the robustness and adaptability of our model for various skeleton-based action recognition tasks.
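The abstract's one-to-many idea, one query representation contrasted against positives drawn from several granularity levels (instance, domain, clip, part), can be sketched as a weighted sum of per-level InfoNCE terms. This is a minimal illustrative sketch, not the authors' implementation: the function names, the plain InfoNCE form, and the weighting scheme are all assumptions.

```python
# Hypothetical sketch of a one-to-many hierarchical contrastive objective:
# one query embedding is pulled toward positives from several granularity
# levels and pushed away from a shared negative set. Names and loss form
# are illustrative assumptions, not the paper's exact method.
import numpy as np

def info_nce(query, positive, negatives, temperature=0.1):
    """Standard InfoNCE loss for one (query, positive) pair vs. negatives."""
    def cos(a, b):
        return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))
    logits = np.array([cos(query, positive)] + [cos(query, n) for n in negatives])
    logits /= temperature
    logits -= logits.max()                       # numerical stability
    probs = np.exp(logits) / np.exp(logits).sum()
    return -np.log(probs[0])                     # positive sits at index 0

def one_to_many_hc_loss(query, level_positives, negatives, level_weights=None):
    """Sum weighted InfoNCE terms: one query vs. one positive per level
    (e.g. instance, domain, clip, part embeddings of the other view)."""
    if level_weights is None:
        level_weights = [1.0] * len(level_positives)
    return sum(w * info_nce(query, pos, negatives)
               for w, pos in zip(level_weights, level_positives))

# Toy example: 4 levels of slightly perturbed positives, 8 random negatives.
rng = np.random.default_rng(0)
q = rng.normal(size=16)
positives = [q + 0.05 * rng.normal(size=16) for _ in range(4)]
negs = [rng.normal(size=16) for _ in range(8)]
loss = one_to_many_hc_loss(q, positives, negs)
print(loss)
```

In practice each level's positive would come from a downsampling module over the Seq2Seq encoder outputs, and the loss would be computed over a batch rather than a single query.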
Pages: 2170-2186
Page count: 17
Related Papers
48 records
  • [1] EnsCLR: Unsupervised skeleton-based action recognition via ensemble contrastive learning of representation
    Wang, Kun
    Cao, Jiuxin
    Cao, Biwei
    Liu, Bo
    COMPUTER VISION AND IMAGE UNDERSTANDING, 2024, 247
  • [2] Hierarchical Contrast for Unsupervised Skeleton-Based Action Representation Learning
    Dong, Jianfeng
    Sun, Shengkai
    Liu, Zhonglin
    Chen, Shujie
    Liu, Baolong
    Wang, Xun
    THIRTY-SEVENTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, VOL 37 NO 1, 2023, : 525 - 533
  • [3] Hierarchical Consistent Contrastive Learning for Skeleton-Based Action Recognition with Growing Augmentations
    Zhang, Jiahang
    Lin, Lilang
    Liu, Jiaying
    THIRTY-SEVENTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, VOL 37 NO 3, 2023, : 3427 - 3435
  • [4] Global-local contrastive multiview representation learning for skeleton-based action
    Bian, Cunling
    Feng, Wei
    Meng, Fanbo
    Wang, Song
    COMPUTER VISION AND IMAGE UNDERSTANDING, 2023, 229
  • [5] JointContrast: Skeleton-Based Mutual Action Recognition with Contrastive Learning
    Jia, Xiangze
    Zhang, Ji
    Wang, Zhen
    Luo, Yonglong
    Chen, Fulong
    Xiao, Jing
    PRICAI 2022: TRENDS IN ARTIFICIAL INTELLIGENCE, PT III, 2022, 13631 : 478 - 489
  • [6] Unsupervised skeleton-based action representation learning via relation consistency pursuit
    Zhang, Wenjing
    Hou, Yonghong
    Zhang, Haoyuan
    NEURAL COMPUTING & APPLICATIONS, 2022, 34 (22): : 20327 - 20339
  • [7] Bootstrapped Representation Learning for Skeleton-Based Action Recognition
    Moliner, Olivier
    Huang, Sangxia
    Astrom, Kalle
    2022 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION WORKSHOPS, CVPRW 2022, 2022, : 4153 - 4163
  • [8] JointContrast: Skeleton-Based Interaction Recognition with New Representation and Contrastive Learning
    Zhang, Ji
    Jia, Xiangze
    Wang, Zhen
    Luo, Yonglong
    Chen, Fulong
    Yang, Gaoming
    Zhao, Lihui
    ALGORITHMS, 2023, 16 (04)
  • [9] InfoGCN: Representation Learning for Human Skeleton-based Action Recognition
    Chi, Hyung-gun
    Ha, Myoung Hoon
    Chi, Seunggeun
    Lee, Sang Wan
    Huang, Qixing
    Ramani, Karthik
    2022 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2022), 2022, : 20154 - 20164