MCDGait: multimodal co-learning distillation network with spatial-temporal graph reasoning for gait recognition in the wild

Times Cited: 0
Authors
Xiong, Jianbo [1 ]
Zou, Shinan [1 ]
Tang, Jin [1 ]
Tjahjadi, Tardi [2 ]
Affiliations
[1] Cent South Univ, Sch Automation, Changsha, Peoples R China
[2] Univ Warwick, Sch Engn, Coventry, England
Source
VISUAL COMPUTER | 2024, Vol. 40, No. 10
Keywords
Biometrics; Human identification; Gait recognition; Multimodal co-learning distillation; Spatial-temporal graph reasoning;
DOI
10.1007/s00371-024-03426-y
Chinese Library Classification (CLC)
TP31 [Computer Software]
Discipline Code
081202; 0835
Abstract
Gait recognition in the wild has attracted the attention of the academic community. However, existing unimodal algorithms cannot achieve the same performance on in-the-wild datasets as on in-the-lab datasets, because unimodal data suffer from many limitations in unconstrained environments. Therefore, we propose a multimodal approach combining silhouettes and skeletons and formulate multimodal gait recognition as a multimodal co-learning problem. In particular, we propose a multimodal co-learning distillation network (MCDGait) that integrates two sub-networks processing unimodal data into a single fusion network. Based on the semantic consistency of the different modalities and the paradigm of deep mutual learning, the performance of the entire network is continuously improved via bidirectional knowledge distillation between the sub-networks and the fusion network. Inspired by the observation that specific body parts or joints exhibit unique motion characteristics and are linked with other parts or joints during walking, we propose a spatial-temporal graph reasoning module (ST-GRM). This module represents the parts or joints as graph nodes and the motion linkages between them as edges. By utilizing a dynamic graph generator, the module implicitly captures the dynamic changes of the human body. Based on the generated graphs, the independent spatial-temporal linkage feature of each part and the interactive spatial-temporal linkage feature between parts are aggregated simultaneously. Extensive experiments conducted on two in-the-wild datasets demonstrate the state-of-the-art performance of the proposed method. The average rank-1 accuracy on the Gait3D and GREW datasets is 50.90% and 58.06%, respectively. The source code can be obtained from https://github.com/BoyeXiong/MCDGait.
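The bidirectional knowledge distillation described in the abstract follows the deep mutual learning paradigm, where each unimodal sub-network and the fusion network learn from the task labels and from each other's softened predictions. The PyTorch sketch below is a minimal illustration of that idea only; the function and argument names (logits_sil, logits_skel, logits_fusion, temperature tau) and the unweighted sum of loss terms are assumptions for illustration, not the paper's exact objective.

```python
import torch
import torch.nn.functional as F

def mutual_distillation_loss(logits_sil, logits_skel, logits_fusion, labels, tau=1.0):
    """Deep-mutual-learning style co-learning objective (illustrative sketch):
    each branch is supervised by the task loss plus KL terms that pull its
    softened predictions towards the other branches', so knowledge flows
    bidirectionally between the unimodal sub-networks and the fusion network."""

    def kd(student_logits, teacher_logits):
        # KL divergence between temperature-softened distributions;
        # the teacher side is detached so gradients flow only into the student.
        return F.kl_div(
            F.log_softmax(student_logits / tau, dim=1),
            F.softmax(teacher_logits.detach() / tau, dim=1),
            reduction="batchmean",
        ) * tau * tau

    # Task (identity classification) loss for every branch.
    ce = (F.cross_entropy(logits_sil, labels)
          + F.cross_entropy(logits_skel, labels)
          + F.cross_entropy(logits_fusion, labels))

    # Bidirectional distillation: sub-networks learn from the fusion network,
    # and the fusion network learns from both sub-networks.
    distill = (kd(logits_sil, logits_fusion) + kd(logits_skel, logits_fusion)
               + kd(logits_fusion, logits_sil) + kd(logits_fusion, logits_skel))

    return ce + distill
```

In practice the relative weighting of the task and distillation terms, and any additional metric losses (e.g., triplet loss) used by MCDGait, would follow the paper and released code rather than the equal weights assumed here.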
Pages: 7221-7234
Page count: 14