Combine multi-order representation learning and frame optimization learning for skeleton-based action recognition

Cited by: 0

Authors:

Nong, Liping ^{[1,3,5]}

Huang, Zhuocheng ^{[2,4]}

Wang, Junyi ^{[1]}

Rong, Yanpeng ^{[2,4]}

Peng, Jie ^{[1]}

Huang, Yiping ^{[2,4]}

Affiliations:

[1] Guilin Univ Elect Technol, Sch Informat & Commun, Guilin 541004, Peoples R China

[2] Guangxi Normal Univ, Sch Elect & Informat Engn, Guangxi Key Lab Brain inspired Comp & Intelligent, Guilin 541004, Peoples R China

[3] Guilin Univ Elect Technol, Key Lab Cognit Radio & Informat Proc, Minist Educ, Guilin 541004, Peoples R China

[4] Guangxi Normal Univ, Educ Dept Guangxi Zhuang Autonomous Reg, Key Lab Integrated Circuits & Microsyst, Guilin 541004, Peoples R China

[5] Guangxi Normal Univ, Coll Phys & Technol, Guilin 541004, Peoples R China

Source:

DIGITAL SIGNAL PROCESSING | 2025 / Vol. 156

Funding:

National Natural Science Foundation of China;

Keywords:

Skeleton-based action recognition; Graph convolutional network; Hypergraph convolutional network; Frame optimization learning;

DOI:

10.1016/j.dsp.2024.104823

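Chinese Library Classification (CLC):

TM [Electrical engineering]; TN [Electronics and communication technology];

Discipline classification codes: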

0808 ; 0809 ;

Abstract:

Skeleton-based action recognition has broad application prospects in many fields, such as virtual reality. Currently, the most popular approach is to employ Graph Convolutional Networks (GCNs) or Hypergraph Convolutional Networks (HGCNs) for this task. However, GCN-based methods may rely heavily on the physical connectivity between joints while failing to capture higher-order information about interactions among distant joints, and HGCN-based methods usually introduce unnecessary noise when capturing low-order information from skeleton structures with simple topology. In addition, current methods do not deal well with redundant frames and confusing frames. These limitations hinder improvements in recognition accuracy. In this paper, we propose a novel network, called Hyper-Net, which combines multi-order representation learning and frame optimization learning for skeleton-based action recognition. Specifically, the proposed Hyper-Net contains Temporal-Channel Aggregation Graph Convolution (TCA-GC), Spatial-Temporal Aggregation Hypergraph Convolution (STA-HC) and Frame Optimization Learning (F-OL) modules. The TCA-GC aggregates low-order and local information from simple joint and bone topologies across different temporal and channel dimensions. The STA-HC captures high-order and global information from complex motion streams and addresses the problem of spatial-temporal weight imbalance. The F-OL adaptively extracts key frames and distinguishes confusing frames, thus improving the ability of the network to recognize confusing actions. Extensive experiments are conducted on the NTU RGB+D, NTU RGB+D 120 and NW-UCLA datasets for the action recognition task. Experimental results demonstrate the superiority and effectiveness of the proposed network.
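
The contrast the abstract draws between low-order (graph) and high-order (hypergraph) aggregation can be illustrated with a minimal NumPy sketch. This is not the paper's TCA-GC or STA-HC module; it uses the generic textbook operators (a symmetrically normalized adjacency with self-loops for the graph case, and a normalized incidence-matrix product with unit hyperedge weights for the hypergraph case). The five-joint toy skeleton, the hyperedge groupings, and all function names are assumptions made up for illustration.

```python
import numpy as np

def graph_operator(bones, num_joints):
    """Generic GCN operator D^-1/2 (A + I) D^-1/2 built from bone connections."""
    A = np.eye(num_joints)                      # self-loops
    for i, j in bones:
        A[i, j] = A[j, i] = 1.0                 # undirected bone
    d_inv_sqrt = np.diag(A.sum(axis=1) ** -0.5)
    return d_inv_sqrt @ A @ d_inv_sqrt

def hypergraph_operator(hyperedges, num_joints):
    """Generic hypergraph operator Dv^-1/2 H De^-1 H^T Dv^-1/2 (unit edge weights).

    Assumes every joint belongs to at least one hyperedge; otherwise the
    vertex degree is zero and the normalization is undefined.
    """
    H = np.zeros((num_joints, len(hyperedges)))  # incidence matrix
    for e, joints in enumerate(hyperedges):
        H[list(joints), e] = 1.0
    dv_inv_sqrt = np.diag(H.sum(axis=1) ** -0.5)  # vertex degrees
    de_inv = np.diag(1.0 / H.sum(axis=0))         # hyperedge degrees
    return dv_inv_sqrt @ H @ de_inv @ H.T @ dv_inv_sqrt

def conv_layer(operator, X, W):
    """One layer: aggregate joint features, project channels, apply ReLU."""
    return np.maximum(operator @ X @ W, 0.0)

# Hypothetical toy skeleton (a chain):
# 0 left hand, 1 left elbow, 2 neck, 3 right elbow, 4 right hand
NUM_JOINTS = 5
bones = [(0, 1), (1, 2), (2, 3), (3, 4)]
# Hypothetical hyperedges: left arm, right arm, both hands, upper body
hyperedges = [(0, 1), (3, 4), (0, 4), (1, 2, 3)]

G = graph_operator(bones, NUM_JOINTS)
HG = hypergraph_operator(hyperedges, NUM_JOINTS)

rng = np.random.default_rng(0)
X = rng.standard_normal((NUM_JOINTS, 3))   # per-joint input features
W = rng.standard_normal((3, 4))            # channel projection

out_graph = conv_layer(G, X, W)
out_hyper = conv_layer(HG, X, W)

# The two hands share no bone, so one graph layer does not mix them,
# while the "both hands" hyperedge links them in a single step.
print("graph weight hand-hand:     ", G[0, 4])
print("hypergraph weight hand-hand:", HG[0, 4])
```

In this toy setup the graph operator needs several stacked layers before information from one hand reaches the other, whereas the hypergraph operator connects them directly. The price is that hyperedges covering many joints can blur simple local structure, which matches the trade-off the abstract describes.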


Pages: 12

