Combine multi-order representation learning and frame optimization learning for skeleton-based action recognition

Cited by: 0
Authors
Nong, Liping [1, 3, 5]
Huang, Zhuocheng [2, 4]
Wang, Junyi [1]
Rong, Yanpeng [2, 4]
Peng, Jie [1]
Huang, Yiping [2, 4]
Affiliations
[1] Guilin Univ Elect Technol, Sch Informat & Commun, Guilin 541004, Peoples R China
[2] Guangxi Normal Univ, Sch Elect & Informat Engn, Guangxi Key Lab Brain inspired Comp & Intelligent, Guilin 541004, Peoples R China
[3] Guilin Univ Elect Technol, Key Lab Cognit Radio & Informat Proc, Minist Educ, Guilin 541004, Peoples R China
[4] Guangxi Normal Univ, Educ Dept Guangxi Zhuang Autonomous Reg, Key Lab Integrated Circuits & Microsyst, Guilin 541004, Peoples R China
[5] Guangxi Normal Univ, Coll Phys & Technol, Guilin 541004, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Skeleton-based action recognition; Graph convolutional network; Hypergraph convolutional network; Frame optimization learning
DOI
10.1016/j.dsp.2024.104823
Chinese Library Classification number
TM [Electrical engineering]; TN [Electronic technology, communication technology]
Discipline classification code
0808; 0809
Abstract
Skeleton-based action recognition has broad application prospects in many fields such as virtual reality. Currently, the most popular approach is to employ Graph Convolutional Networks (GCNs) or Hypergraph Convolutional Networks (HGCNs) for this task. However, GCN-based methods may rely heavily on the physical connectivity between joints while failing to capture higher-order information about interactions among distant joints, and HGCN-based methods usually introduce unnecessary noise when capturing low-order information from skeleton structures with simple topology. In addition, current methods do not handle redundant frames and confusing frames well. These limitations hinder improvements in recognition accuracy. In this paper, we propose a novel network, called Hyper-Net, which combines multi-order representation learning and frame optimization learning for skeleton-based action recognition. Specifically, the proposed Hyper-Net contains Temporal-Channel Aggregation Graph Convolution (TCA-GC), Spatial-Temporal Aggregation Hypergraph Convolution (STA-HC), and Frame Optimization Learning (F-OL) modules. The TCA-GC aggregates low-order and local information from simple joint and bone topologies across different temporal and channel dimensions. The STA-HC captures high-order and global information from complex motion streams and addresses the problem of spatial-temporal weight imbalance. The F-OL adaptively extracts key frames and distinguishes confusing frames, thus improving the network's ability to recognize confusing actions. Extensive experiments are conducted on the NTU RGB+D, NTU RGB+D 120, and NW-UCLA datasets for the action recognition task. Experimental results demonstrate the superiority and effectiveness of the proposed network.
Pages: 12
Related papers
50 in total
  • [21] SkelResNet: Transfer Learning Approach for Skeleton-Based Action Recognition
    Kilic, Ugur
    Karadag, Ozge Oztimur
    Ozyer, Gulsah Tumuklu
    32ND IEEE SIGNAL PROCESSING AND COMMUNICATIONS APPLICATIONS CONFERENCE, SIU 2024, 2024,
  • [22] Key Frame Selection for Temporal Graph Optimization of Skeleton-Based Action Recognition
    Hou, Jingyi
    Su, Lei
    Zhao, Yan
    APPLIED SCIENCES-BASEL, 2024, 14 (21):
  • [23] Contrast-Reconstruction Representation Learning for Self-Supervised Skeleton-Based Action Recognition
    Wang, Peng
    Wen, Jun
    Si, Chenyang
    Qian, Yuntao
    Wang, Liang
    IEEE TRANSACTIONS ON IMAGE PROCESSING, 2022, 31 : 6224 - 6238
  • [24] InfoGCN++: Learning Representation by Predicting the Future for Online Skeleton-Based Action Recognition
    Chi, Seunggeun
    Chi, Hyung-Gun
    Huang, Qixing
    Ramani, Karthik
    IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2025, 47 (01) : 514 - 528
  • [25] Skeleton-based Action Recognition Based on Deep Learning and Grassmannian Pyramids
    Konstantinidis, Dimitrios
    Dimitropoulos, Kosmas
    Daras, Petros
    2018 26TH EUROPEAN SIGNAL PROCESSING CONFERENCE (EUSIPCO), 2018, : 2045 - 2049
  • [26] Multi-Granularity Anchor-Contrastive Representation Learning for Semi-Supervised Skeleton-Based Action Recognition
    Shu, Xiangbo
    Xu, Binqian
    Zhang, Liyan
    Tang, Jinhui
    IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2023, 45 (06) : 7559 - 7576
  • [27] JointContrast: Skeleton-Based Interaction Recognition with New Representation and Contrastive Learning
    Zhang, Ji
    Jia, Xiangze
    Wang, Zhen
    Luo, Yonglong
    Chen, Fulong
    Yang, Gaoming
    Zhao, Lihui
    ALGORITHMS, 2023, 16 (04)
  • [28] AL-SAR: Active Learning for Skeleton-Based Action Recognition
    Li, Jingyuan
    Le, Trung
    Shlizerman, Eli
    IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2024, 35 (11) : 16966 - 16974
  • [29] Adaptive Feature Selection With Reinforcement Learning for Skeleton-Based Action Recognition
    Xu, Zheyuan
    Wang, Yingfu
    Jiang, Jiaqin
    Yao, Jian
    Li, Liang
    IEEE ACCESS, 2020, 8 : 213038 - 213051
  • [30] Skeleton MixFormer: Multivariate Topology Representation for Skeleton-based Action Recognition
    Xin, Wentian
    Miao, Qiguang
    Liu, Yi
    Liu, Ruyi
    Pun, Chi-Man
    Shi, Cheng
    PROCEEDINGS OF THE 31ST ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, MM 2023, 2023, : 2211 - 2220