Combine multi-order representation learning and frame optimization learning for skeleton-based action recognition

Cited by: 0
Authors
Nong, Liping [1, 3, 5]
Huang, Zhuocheng [2, 4]
Wang, Junyi [1]
Rong, Yanpeng [2, 4]
Peng, Jie [1]
Huang, Yiping [2, 4]
Affiliations
[1] Guilin Univ Elect Technol, Sch Informat & Commun, Guilin 541004, Peoples R China
[2] Guangxi Normal Univ, Sch Elect & Informat Engn, Guangxi Key Lab Brain inspired Comp & Intelligent, Guilin 541004, Peoples R China
[3] Guilin Univ Elect Technol, Key Lab Cognit Radio & Informat Proc, Minist Educ, Guilin 541004, Peoples R China
[4] Guangxi Normal Univ, Educ Dept Guangxi Zhuang Autonomous Reg, Key Lab Integrated Circuits & Microsyst, Guilin 541004, Peoples R China
[5] Guangxi Normal Univ, Coll Phys & Technol, Guilin 541004, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Skeleton-based action recognition; Graph convolutional network; Hypergraph convolutional network; Frame optimization learning;
DOI
10.1016/j.dsp.2024.104823
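Chinese Library Classification number
TM [Electrical engineering]; TN [Electronic technology, communication technology]
Discipline classification code
0808; 0809
Abstract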
Skeleton-based action recognition has broad application prospects in many fields, such as virtual reality. Currently, the most popular approach is to employ Graph Convolutional Networks (GCNs) or Hypergraph Convolutional Networks (HGCNs) for this task. However, GCN-based methods may rely heavily on the physical connectivity between joints while failing to capture higher-order information about interactions among distant joints, and HGCN-based methods usually introduce unnecessary noise when capturing low-order information from skeleton structures with simple topology. In addition, current methods do not deal well with redundant frames and confusing frames. These limitations hinder improvements in recognition accuracy. In this paper, we propose a novel network, called Hyper-Net, which combines multi-order representation learning and frame optimization learning for skeleton-based action recognition. Specifically, the proposed Hyper-Net contains Temporal-Channel Aggregation Graph Convolution (TCA-GC), Spatial-Temporal Aggregation Hypergraph Convolution (STA-HC) and Frame Optimization Learning (F-OL) modules. The TCA-GC aggregates low-order and local information from simple joint and bone topologies across different temporal and channel dimensions. The STA-HC captures high-order and global information from complex motion streams while also addressing the problem of spatial-temporal weight imbalance. The F-OL adaptively extracts key frames and distinguishes confusing frames, thus improving the ability of the network to recognize confusing actions. Extensive experiments are conducted on the NTU RGB+D, NTU RGB+D 120 and NW-UCLA datasets for the action recognition task. Experimental results demonstrate the superiority and effectiveness of the proposed network.
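The contrast the abstract draws between low-order (graph) and high-order (hypergraph) aggregation can be illustrated with a minimal sketch. This is a generic textbook-style mean aggregation, not the TCA-GC or STA-HC modules of Hyper-Net, whose details the abstract does not give. The 5-joint chain, the three hyperedges, and all function names are hypothetical and chosen only for illustration.

```python
# Illustrative sketch only: generic graph vs. hypergraph mean aggregation on a
# toy 5-joint chain. This is NOT the Hyper-Net implementation; the skeleton,
# the hyperedges, and the function names are assumptions made for the example.

def matmul(a, b):
    """Plain-Python matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def row_normalize(m):
    """Divide every row by its sum, turning a 0/1 matrix into a mean operator."""
    return [[v / sum(row) for v in row] for row in m]

def graph_aggregate(adj, feats):
    """One step of D^-1 (A + I) X: each joint averages itself and its bone neighbours."""
    n = len(adj)
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    return matmul(row_normalize(a_hat), feats)

def hypergraph_aggregate(incidence, feats):
    """One step of Dv^-1 H De^-1 H^T X: joints -> hyperedges -> joints."""
    h_t = [list(col) for col in zip(*incidence)]
    edge_feats = matmul(row_normalize(h_t), feats)      # mean feature per hyperedge
    return matmul(row_normalize(incidence), edge_feats)  # mean over a joint's hyperedges

# Toy chain: 0 left hand - 1 left elbow - 2 torso - 3 right elbow - 4 right hand
adj = [
    [0, 1, 0, 0, 0],
    [1, 0, 1, 0, 0],
    [0, 1, 0, 1, 0],
    [0, 0, 1, 0, 1],
    [0, 0, 0, 1, 0],
]
# Hyperedges (columns): left arm {0,1,2}, right arm {2,3,4}, both hands {0,4}
incidence = [
    [1, 0, 1],
    [1, 0, 0],
    [1, 1, 0],
    [0, 1, 0],
    [0, 1, 1],
]
# A single feature channel that is active only on the right hand (joint 4)
feats = [[0.0], [0.0], [0.0], [0.0], [1.0]]

graph_out = graph_aggregate(adj, feats)
hyper_out = hypergraph_aggregate(incidence, feats)
print("left hand after graph step:     ", graph_out[0][0])
print("left hand after hypergraph step:", hyper_out[0][0])
```

In this toy setup, one graph step leaves the left hand unaware of the right hand because the two joints are four bones apart, whereas the "both hands" hyperedge passes that signal in a single step. The same hyperedges also mix features among joints that were already adjacent, which is in the spirit of the noise concern the abstract raises for simple topologies.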
Pages: 12