OTM-HC: Enhanced Skeleton-Based Action Representation via One-to-Many Hierarchical Contrastive Learning

Cited by: 0
Authors
Usman, Muhammad [1 ,2 ,3 ]
Cao, Wenming [1 ,2 ,3 ]
Huang, Zhao [4 ]
Zhong, Jianqi [1 ,2 ,3 ]
Ji, Ruiya [5 ]
Affiliations
[1] Shenzhen Univ, Coll Elect & Informat Engn, Shenzhen 518060, Peoples R China
[2] Guangdong Key Lab Intelligent Informat Proc, Shenzhen 518060, Peoples R China
[3] Shenzhen Univ, Shenzhen 518060, Peoples R China
[4] Northumbria Univ, Dept Comp & Informat Sci, Newcastle NE1 8ST, England
[5] Queen Mary Univ London, Dept Comp Sci, London E1 4NS, England
Funding
National Natural Science Foundation of China
Keywords
skeleton-based action representation learning; unsupervised learning; hierarchical contrastive learning; one-to-many; graph convolutional networks; LSTM
DOI
10.3390/ai5040106
Chinese Library Classification
TP18 [Artificial Intelligence Theory]
Discipline Classification Codes
081104; 0812; 0835; 1405
Abstract
Human action recognition has become crucial in computer vision, with growing applications in surveillance, human-computer interaction, and healthcare. Traditional approaches often rely on broad feature representations, which can miss subtle variations in timing and movement within action sequences. Our proposed One-to-Many Hierarchical Contrastive Learning (OTM-HC) framework maps the input into multi-layered feature vectors, creating a hierarchical contrastive representation that captures multiple granularities across the temporal and spatial domains of a human skeleton sequence. Using sequence-to-sequence (Seq2Seq) transformer encoders and downsampling modules, OTM-HC distinguishes between multiple levels of action representation, such as the instance, domain, clip, and part levels, each of which contributes to a comprehensive understanding of the action. The OTM-HC design is adaptable, ensuring smooth integration with advanced Seq2Seq encoders. We evaluated OTM-HC on four datasets, demonstrating improved performance over state-of-the-art models: it achieved improvements of 0.9% and 0.6% on NTU60, 0.4% and 0.7% on NTU120, and 0.7% and 0.3% on PKU-MMD I and II, respectively, surpassing previous leading approaches on these datasets. These results showcase the robustness and adaptability of our model for various skeleton-based action recognition tasks.
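The abstract's one-to-many idea, one query representation contrasted against positives drawn from several granularity levels (instance, domain, clip, part), can be sketched as a weighted sum of per-level InfoNCE terms. This is a minimal illustrative sketch, not the authors' implementation: the function names, the plain InfoNCE form, and the weighting scheme are all assumptions.

```python
# Hypothetical sketch of a one-to-many hierarchical contrastive objective:
# one query embedding is pulled toward positives from several granularity
# levels and pushed away from a shared negative set. Names and loss form
# are illustrative assumptions, not the paper's exact method.
import numpy as np

def info_nce(query, positive, negatives, temperature=0.1):
    """Standard InfoNCE loss for one (query, positive) pair vs. negatives."""
    def cos(a, b):
        return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))
    logits = np.array([cos(query, positive)] + [cos(query, n) for n in negatives])
    logits /= temperature
    logits -= logits.max()                       # numerical stability
    probs = np.exp(logits) / np.exp(logits).sum()
    return -np.log(probs[0])                     # positive sits at index 0

def one_to_many_hc_loss(query, level_positives, negatives, level_weights=None):
    """Sum weighted InfoNCE terms: one query vs. one positive per level
    (e.g. instance, domain, clip, part embeddings of the other view)."""
    if level_weights is None:
        level_weights = [1.0] * len(level_positives)
    return sum(w * info_nce(query, pos, negatives)
               for w, pos in zip(level_weights, level_positives))

# Toy example: 4 levels of slightly perturbed positives, 8 random negatives.
rng = np.random.default_rng(0)
q = rng.normal(size=16)
positives = [q + 0.05 * rng.normal(size=16) for _ in range(4)]
negs = [rng.normal(size=16) for _ in range(8)]
loss = one_to_many_hc_loss(q, positives, negs)
print(loss)
```

In practice each level's positive would come from a downsampling module over the Seq2Seq encoder outputs, and the loss would be computed over a batch rather than a single query.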
Pages: 2170-2186
Page count: 17
Related Papers
48 records
  • [1] EnsCLR: Unsupervised skeleton-based action recognition via ensemble contrastive learning of representation
    Wang, Kun
    Cao, Jiuxin
    Cao, Biwei
    Liu, Bo
    COMPUTER VISION AND IMAGE UNDERSTANDING, 2024, 247
  • [2] Hierarchical Contrast for Unsupervised Skeleton-Based Action Representation Learning
    Dong, Jianfeng
    Sun, Shengkai
    Liu, Zhonglin
    Chen, Shujie
    Liu, Baolong
    Wang, Xun
    THIRTY-SEVENTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, VOL 37 NO 1, 2023, : 525 - 533
  • [3] Hierarchical Consistent Contrastive Learning for Skeleton-Based Action Recognition with Growing Augmentations
    Zhang, Jiahang
    Lin, Lilang
    Liu, Jiaying
    THIRTY-SEVENTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, VOL 37 NO 3, 2023, : 3427 - 3435
  • [4] Global-local contrastive multiview representation learning for skeleton-based action
    Bian, Cunling
    Feng, Wei
    Meng, Fanbo
    Wang, Song
    COMPUTER VISION AND IMAGE UNDERSTANDING, 2023, 229
  • [5] JointContrast: Skeleton-Based Mutual Action Recognition with Contrastive Learning
    Jia, Xiangze
    Zhang, Ji
    Wang, Zhen
    Luo, Yonglong
    Chen, Fulong
    Xiao, Jing
    PRICAI 2022: TRENDS IN ARTIFICIAL INTELLIGENCE, PT III, 2022, 13631 : 478 - 489
  • [6] Unsupervised skeleton-based action representation learning via relation consistency pursuit
    Zhang, Wenjing
    Hou, Yonghong
    Zhang, Haoyuan
    NEURAL COMPUTING & APPLICATIONS, 2022, 34 (22): : 20327 - 20339
  • [7] Bootstrapped Representation Learning for Skeleton-Based Action Recognition
    Moliner, Olivier
    Huang, Sangxia
    Astrom, Kalle
    2022 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION WORKSHOPS, CVPRW 2022, 2022, : 4153 - 4163
  • [8] JointContrast: Skeleton-Based Interaction Recognition with New Representation and Contrastive Learning
    Zhang, Ji
    Jia, Xiangze
    Wang, Zhen
    Luo, Yonglong
    Chen, Fulong
    Yang, Gaoming
    Zhao, Lihui
    ALGORITHMS, 2023, 16 (04)
  • [9] InfoGCN: Representation Learning for Human Skeleton-based Action Recognition
    Chi, Hyung-gun
    Ha, Myoung Hoon
    Chi, Seunggeun
    Lee, Sang Wan
    Huang, Qixing
    Ramani, Karthik
    2022 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2022), 2022, : 20154 - 20164