Prompt-supervised dynamic attention graph convolutional network for skeleton-based action recognition

Cited by: 0
Authors
Zhu, Shasha [1]
Sun, Lu [1]
Ma, Zeyuan [1]
Li, Chenxi [1]
He, Dongzhi [1]
Affiliations
[1] Beijing Univ Technol, Coll Comp Sci, Beijing, Peoples R China
Keywords
Skeleton-based action recognition; Graph convolutional network; Attention mechanism; Dynamic convolution; Prompt learning
DOI
10.1016/j.neucom.2024.128623
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Skeleton-based action recognition is a core task in the field of video understanding. Skeleton sequences are characterized by high information density, low redundancy, and clear structural information, which makes complex relationships among human behaviors easier to analyze than in other modalities. Although existing studies have encoded skeleton data and achieved positive outcomes, they have often overlooked the precise high-level semantic information inherent in the action descriptions. To address this issue, this paper proposes a prompt-supervised dynamic attention graph convolutional network (PDA-GCN). Specifically, the PDA-GCN incorporates a prompt supervision (PS) module that leverages a pre-trained large-scale language model (LLM) as a knowledge engine and retains the generated text features as prompts to provide additional supervision during model training, enhancing the model's ability to distinguish similar actions at negligible computational cost. In addition, to bolster the learning of discriminative features, a dynamic attention graph convolution (DA-GC) module is presented. This module uses a self-attention mechanism to adaptively infer intrinsic relationships between joints and integrates dynamic convolution to strengthen the emphasis on local information. This dual focus on global context and local details further improves the efficiency and effectiveness of the model. Extensive experiments on the widely used skeleton-based action recognition datasets NTU RGB+D 60 and NTU RGB+D 120 demonstrate that the PDA-GCN surpasses known state-of-the-art methods, achieving accuracies of 93.4% on the NTU RGB+D 60 cross-subject split and 90.7% on the NTU RGB+D 120 cross-subject split.
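The idea of using self-attention to infer joint-to-joint relationships, as described in the abstract, can be illustrated with a minimal NumPy sketch. This is not the paper's DA-GC module: it leaves out the dynamic convolution and the prompt supervision, and the function name `attention_graph_conv`, the weight matrices, and the feature sizes are all hypothetical choices made for illustration. The only dataset fact assumed is that NTU RGB+D skeletons have 25 joints.

```python
import numpy as np

def softmax(x, axis=-1):
    # Subtract the row maximum for numerical stability before exponentiating.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def attention_graph_conv(x, w_q, w_k, w_v):
    """Hypothetical single-frame attention graph convolution.

    x: (V, C) array of per-joint features for one frame.
    Returns the aggregated features (V, C_out) and the learned
    joint-to-joint attention matrix (V, V).
    """
    q = x @ w_q                                   # queries, (V, D)
    k = x @ w_k                                   # keys, (V, D)
    scores = q @ k.T / np.sqrt(q.shape[-1])       # scaled dot products, (V, V)
    attn = softmax(scores, axis=-1)               # each row sums to 1
    out = attn @ (x @ w_v)                        # aggregate over all joints
    return out, attn

# Illustrative sizes: 25 joints, 3D coordinates as input features.
rng = np.random.default_rng(0)
V, C, D, C_out = 25, 3, 8, 16
x = rng.standard_normal((V, C))
w_q = rng.standard_normal((C, D))
w_k = rng.standard_normal((C, D))
w_v = rng.standard_normal((C, C_out))

out, attn = attention_graph_conv(x, w_q, w_k, w_v)
```

Here `attn` plays the role of a data-dependent adjacency matrix: instead of a fixed skeleton graph, each joint attends to every other joint with weights computed from the input.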
Pages: 11
Related papers
50 in total
  • [31] A Graph Convolutional Network with Early Attention Module for Skeleton-based Action Prediction
    Liu, Cuiwei
    Zhao, Xiaoxue
    Yan, Zhuo
    Jiang, Youzhi
    Shi, Xiangbin
    2022 26TH INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION (ICPR), 2022, : 1266 - 1272
  • [32] Triplet attention multiple spacetime-semantic graph convolutional network for skeleton-based action recognition
    Yanjing Sun
    Han Huang
    Xiao Yun
    Bin Yang
    Kaiwen Dong
    Applied Intelligence, 2022, 52 : 113 - 126
  • [33] Triplet attention multiple spacetime-semantic graph convolutional network for skeleton-based action recognition
    Sun, Yanjing
    Huang, Han
    Yun, Xiao
    Yang, Bin
    Dong, Kaiwen
    APPLIED INTELLIGENCE, 2022, 52 (01) : 113 - 126
  • [34] Skeleton-based Human Action Recognition via Large-kernel Attention Graph Convolutional Network
    Liu, Yanan
    Zhang, Hao
    Li, Yanqiu
    He, Kangjian
    Xu, Dan
    IEEE TRANSACTIONS ON VISUALIZATION AND COMPUTER GRAPHICS, 2023, 29 (05) : 2575 - 2585
  • [35] Local and global self-attention enhanced graph convolutional network for skeleton-based action recognition
    Wu, Zhize
    Ding, Yue
    Wan, Long
    Li, Teng
    Nian, Fudong
    PATTERN RECOGNITION, 2025, 159
  • [36] Temporal Attention-Augmented Graph Convolutional Network for Efficient Skeleton-Based Human Action Recognition
    Heidari, Negar
    Iosifidis, Alexandros
    2020 25TH INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION (ICPR), 2021, : 7907 - 7914
  • [37] A Vertex-Edge Graph Convolutional Network for Skeleton-Based Action Recognition
    Liu, Kai
    Gao, Lei
    Khan, Naimul Mefraz
    Qi, Lin
    Guan, Ling
    2020 IEEE INTERNATIONAL SYMPOSIUM ON CIRCUITS AND SYSTEMS (ISCAS), 2020,
  • [38] Temporal Receptive Field Graph Convolutional Network for Skeleton-Based Action Recognition
    Zhang, Qingqi
    Wu, Ren
    Nakata, Mitsuru
    Ge, Qi-Wei
    2024 International Technical Conference on Circuits/Systems, Computers, and Communications, ITC-CSCC 2024, 2024,
  • [39] Spatial Graph Convolutional and Temporal Involution Network for Skeleton-based Action Recognition
    Wan, Huifan
    Pan, Guanghui
    Chen, Yu
    Ding, Danni
    Zou, Maoyang
    PROCEEDINGS OF ACM TURING AWARD CELEBRATION CONFERENCE, ACM TURC 2021, 2021, : 204 - 209
  • [40] Part-Level Graph Convolutional Network for Skeleton-Based Action Recognition
    Huang, Linjiang
    Huang, Yan
    Ouyang, Wanli
    Wang, Liang
    THIRTY-FOURTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, THE THIRTY-SECOND INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE CONFERENCE AND THE TENTH AAAI SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2020, 34 : 11045 - 11052