MCDGait: multimodal co-learning distillation network with spatial-temporal graph reasoning for gait recognition in the wild

Cited by: 0
Authors
Xiong, Jianbo [1 ]
Zou, Shinan [1 ]
Tang, Jin [1 ]
Tjahjadi, Tardi [2 ]
Affiliations
[1] Central South University, School of Automation, Changsha, People's Republic of China
[2] University of Warwick, School of Engineering, Coventry, England
Source
VISUAL COMPUTER | 2024, Vol. 40, Issue 10
Keywords
Biometrics; Human identification; Gait recognition; Multimodal co-learning distillation; Spatial-temporal graph reasoning
DOI
10.1007/s00371-024-03426-y
CLC number
TP31 [Computer Software]
Discipline codes
081202; 0835
Abstract
Gait recognition in the wild has attracted the attention of the academic community. However, existing unimodal algorithms cannot achieve the same performance on in-the-wild datasets as on in-the-lab datasets, because unimodal data have many limitations in in-the-wild environments. Therefore, we propose a multimodal approach combining silhouettes and skeletons and formulate the multimodal gait recognition problem as a multimodal co-learning problem. In particular, we propose a multimodal co-learning distillation network (MCDGait) that integrates two sub-networks processing unimodal data into a single fusion network. Based on the semantic consistency of different modalities and the paradigm of deep mutual learning, the performance of the entire network is continuously improved via bidirectional knowledge distillation between the sub-networks and the fusion network. Inspired by the observation that specific body parts or joints exhibit unique motion characteristics and are linked with other parts or joints during walking, we propose a spatial-temporal graph reasoning module (ST-GRM). This module represents the parts or joints as graph nodes and the motion linkages between them as edges. By utilizing a dynamic graph generator, the module implicitly captures the dynamic changes of the human body. Based on the generated graphs, the independent spatial-temporal linkage features of each part and the interactive spatial-temporal linkage features between parts are aggregated simultaneously. Extensive experiments conducted on two in-the-wild datasets demonstrate the state-of-the-art performance of the proposed method. The average rank-1 accuracy on the Gait3D and GREW datasets is 50.90% and 58.06%, respectively. The source code can be obtained from https://github.com/BoyeXiong/MCDGait.
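The abstract describes two mechanisms concretely enough to sketch: bidirectional knowledge distillation between the unimodal sub-networks and the fusion network, and a dynamic graph generator that infers motion-linkage edges between body parts. Below is a minimal PyTorch sketch of both ideas. All names, the KL-based loss form, the temperature, and the attention-style graph construction are illustrative assumptions, not the authors' implementation; the actual code is available at the repository linked above.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class DynamicGraphGenerator(nn.Module):
    """Hypothetical sketch: infer a soft adjacency matrix among body
    parts/joints from their current features, so the graph can change
    as the body moves."""

    def __init__(self, dim: int):
        super().__init__()
        self.query = nn.Linear(dim, dim)
        self.key = nn.Linear(dim, dim)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, num_parts, dim) part/joint features.
        q, k = self.query(x), self.key(x)
        # Pairwise affinities, row-normalized into edge weights that
        # stand in for motion linkages between parts/joints.
        return torch.softmax(q @ k.transpose(1, 2) / x.size(-1) ** 0.5, dim=-1)


def mutual_distillation_loss(logits_sil: torch.Tensor,
                             logits_skel: torch.Tensor,
                             logits_fused: torch.Tensor,
                             t: float = 1.0) -> torch.Tensor:
    """Deep-mutual-learning style bidirectional KD: the fusion network
    and both unimodal sub-networks soften each other's predictions.
    The KL form and temperature t are assumptions, not the paper's
    exact loss."""

    def kd(student: torch.Tensor, teacher: torch.Tensor) -> torch.Tensor:
        return F.kl_div(F.log_softmax(student / t, dim=-1),
                        F.softmax(teacher.detach() / t, dim=-1),
                        reduction="batchmean") * (t * t)

    # Fusion teaches both sub-networks; sub-networks teach the fusion back.
    return (kd(logits_sil, logits_fused) + kd(logits_skel, logits_fused) +
            kd(logits_fused, logits_sil) + kd(logits_fused, logits_skel))
```

In training, a term like this would presumably be added to each branch's recognition loss (e.g., triplet plus cross-entropy, which is common in gait recognition), so the two sub-networks and the fusion network improve jointly rather than in a fixed teacher-student direction.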
Pages: 7221-7234
Page count: 14