MCDGait: multimodal co-learning distillation network with spatial-temporal graph reasoning for gait recognition in the wild

Times Cited: 0
Authors
Xiong, Jianbo [1 ]
Zou, Shinan [1 ]
Tang, Jin [1 ]
Tjahjadi, Tardi [2 ]
Affiliations
[1] Cent South Univ, Sch Automation, Changsha, Peoples R China
[2] Univ Warwick, Sch Engn, Coventry, England
Source
VISUAL COMPUTER | 2024, Vol. 40, No. 10
Keywords
Biometrics; Human identification; Gait recognition; Multimodal co-learning distillation; Spatial-temporal graph reasoning
DOI
10.1007/s00371-024-03426-y
Chinese Library Classification (CLC)
TP31 [Computer Software]
Subject Classification Codes
081202; 0835
Abstract
Gait recognition in the wild has attracted the attention of the academic community. However, existing unimodal algorithms cannot achieve the same performance on in-the-wild datasets as on in-the-lab datasets, because unimodal data have many limitations in in-the-wild environments. Therefore, we propose a multimodal approach combining silhouettes and skeletons and formulate the multimodal gait recognition problem as a multimodal co-learning problem. In particular, we propose a multimodal co-learning distillation network (MCDGait) that integrates two sub-networks processing unimodal data into a single fusion network. Based on the semantic consistency of different modalities and the paradigm of deep mutual learning, the performance of the entire network is continuously improved via bidirectional knowledge distillation between the sub-networks and the fusion network. Inspired by the observation that specific body parts or joints exhibit unique motion characteristics and are linked with other parts or joints during walking, we propose a spatial-temporal graph reasoning module (ST-GRM). This module represents the parts or joints as graph nodes and the motion linkages between them as edges. By utilizing a dynamic graph generator, the module implicitly captures the dynamic changes of the human body. Based on the generated graphs, the independent spatial-temporal linkage feature of each part and the interactive spatial-temporal linkage features between parts are aggregated simultaneously. Extensive experiments conducted on two in-the-wild datasets demonstrate the state-of-the-art performance of the proposed method. The average rank-1 accuracy on the Gait3D and GREW datasets is 50.90% and 58.06%, respectively. The source code can be obtained from https://github.com/BoyeXiong/MCDGait.
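
To make the co-learning distillation paradigm in the abstract concrete, the following is a minimal PyTorch-style sketch, not the authors' released code: it assumes the silhouette branch, skeleton branch, and fusion branch each output identity logits over the same label set, and the function names, temperature, and weight lambda_kd are illustrative assumptions.

# Minimal sketch of the bidirectional co-learning distillation idea from the
# abstract; NOT the authors' released implementation. Every branch is trained
# on the same identity labels (semantic consistency) and, in the
# deep-mutual-learning style, distills from and to the fusion branch.
# Function names, the temperature, and lambda_kd are illustrative assumptions.
import torch
import torch.nn.functional as F


def kd_loss(student_logits: torch.Tensor,
            teacher_logits: torch.Tensor,
            temperature: float = 4.0) -> torch.Tensor:
    """KL-divergence distillation between softened logit distributions."""
    log_p_student = F.log_softmax(student_logits / temperature, dim=1)
    p_teacher = F.softmax(teacher_logits / temperature, dim=1)
    return F.kl_div(log_p_student, p_teacher,
                    reduction="batchmean") * temperature ** 2


def co_learning_distillation_loss(sil_logits, ske_logits, fusion_logits,
                                  labels, lambda_kd: float = 1.0):
    """Classification loss for all three branches plus bidirectional
    distillation between each unimodal sub-network and the fusion network."""
    # Every branch predicts the same identity labels.
    ce = (F.cross_entropy(sil_logits, labels)
          + F.cross_entropy(ske_logits, labels)
          + F.cross_entropy(fusion_logits, labels))

    # Bidirectional knowledge transfer: the sub-networks learn from the
    # fusion network and the fusion network learns from the sub-networks.
    # The teacher side is detached so each KL term updates only its student.
    kd = (kd_loss(sil_logits, fusion_logits.detach())
          + kd_loss(ske_logits, fusion_logits.detach())
          + kd_loss(fusion_logits, sil_logits.detach())
          + kd_loss(fusion_logits, ske_logits.detach()))

    return ce + lambda_kd * kd

In this sketch the temperature-squared scaling follows standard soft-label distillation practice; the paper's actual loss composition and weighting may differ.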
Pages: 7221-7234
Page count: 14
Related Papers
50 records in total
  • [1] Gait Recognition Algorithm based on Spatial-temporal Graph Neural Network
    Zhou, Jian
    Yan, Shi
    Zhang, Jie
    2022 INTERNATIONAL CONFERENCE ON BIG DATA, INFORMATION AND COMPUTER NETWORK (BDICN 2022), 2022, : 63 - 67
  • [2] Gait Recognition Algorithm based on Spatial-temporal Graph Neural Network
    Shi, Huan
    Hui, Bo
    Hu, Biao
    Gu, RongJie
    2022 INTERNATIONAL CONFERENCE ON BIG DATA, INFORMATION AND COMPUTER NETWORK (BDICN 2022), 2022, : 59 - 62
  • [3] Gait Recognition Algorithm based on Spatial-temporal Graph Neural Network
    Lan, TianYi
    Shi, ZongBin
    Wang, KeJun
    Yin, ChaoQun
    2022 INTERNATIONAL CONFERENCE ON BIG DATA, INFORMATION AND COMPUTER NETWORK (BDICN 2022), 2022, : 55 - 58
  • [4] Spatial-Temporal Pyramid Graph Reasoning for Action Recognition
    Geng, Tiantian
    Zheng, Feng
    Hou, Xiaorong
    Lu, Ke
    Qi, Guo-Jun
    Shao, Ling
    IEEE TRANSACTIONS ON IMAGE PROCESSING, 2022, 31 : 5484 - 5497
  • [5] A Dual Attention Spatial-Temporal Graph Convolutional Network for Emotion Recognition from Gait
    Liu, Jiaqing
    Kisita, Shoji
    Chai, Shurong
    Tateyama, Tomoko
    Iwamoto, Yutaro
    Chen, Yen-Wei
    Journal of the Institute of Image Electronics Engineers of Japan, 2022, 51 (04): 309 - 317
  • [6] Multimodal Fusion of Spatial-Temporal Features for Emotion Recognition in the Wild
    Wang, Zuchen
    Fang, Yuchun
    ADVANCES IN MULTIMEDIA INFORMATION PROCESSING - PCM 2017, PT I, 2018, 10735 : 205 - 214
  • [7] Transformer-Based Multimodal Spatial-Temporal Fusion for Gait Recognition
    Zhang, Jikai
    Ji, Mengyu
    He, Yihao
    Guo, Dongliang
    PATTERN RECOGNITION AND COMPUTER VISION, PRCV 2024, PT XV, 2025, 15045 : 494 - 507
  • [8] Complex Event Recognition via Spatial-Temporal Relation Graph Reasoning
    Lin, Huan
    Zhao, Hongtian
    Yang, Hua
    2021 INTERNATIONAL CONFERENCE ON VISUAL COMMUNICATIONS AND IMAGE PROCESSING (VCIP), 2021,
  • [9] Efficient Gait Recognition via Spatial-Temporal Decoupled Network
    Tang, Peisen
    Su, Han
    Gao, Ruixuan
    Zhao, Wensheng
    Tang, Chaoying
    2023 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS, IJCNN, 2023,
  • [10] Action Recognition Using a Spatial-Temporal Network for Wild Felines
    Feng, Liqi
    Zhao, Yaqin
    Sun, Yichao
    Zhao, Wenxuan
    Tang, Jiaxi
    ANIMALS, 2021, 11 (02): 1 - 18