SKIM: Skeleton-Based Isolated Sign Language Recognition With Part Mixing

Cited by: 3

Authors:

Lin, Kezhou [1]

Wang, Xiaohan [2]

Zhu, Linchao [1]

Zhang, Bang [3]

Yang, Yi [1]

Affiliations:

[1] Zhejiang Univ, Hangzhou 310027, Peoples R China

[2] Zhejiang Univ, Coll Comp Sci & Technol, Hangzhou 310027, Peoples R China

[3] Alibaba Grp, DAMO Acad, Hangzhou 311121, Peoples R China

Source:

IEEE TRANSACTIONS ON MULTIMEDIA | 2024 / Vol. 26

Keywords:

Sign language; Face recognition; Biological system modeling; Manuals; Benchmark testing; Assistive technologies; Data augmentation; sign language recognition; skeleton; MODEL;

DOI:

10.1109/TMM.2023.3321502

Chinese Library Classification (CLC) number:

TP [Automation technology, computer technology];

Discipline classification code:

0812;

Abstract:

In this article, we present SKIM, skeleton-based isolated sign language recognition (IsoSLR) with part mixing: an IsoSLR model that takes only the skeleton representation of the human body as input. Previous skeleton-based works either perform worse than their RGB-based counterparts or require fusion with other modalities to obtain competitive results. With SKIM, a single skeleton-based model without complex pre-training can obtain accuracy similar to or even higher than current state-of-the-art methods. This margin can be further increased by simple late fusion within the same modality. To achieve this, we first develop a novel data augmentation technique called part mixing. It swaps the corresponding keypoints within one region (e.g., the hand) between two randomly selected samples and combines their labels linearly to form the new label. As regions such as the hands and face are key articulators for sign language, directly swapping such parts creates a believable pseudo sign that encourages the model to recognize the true pairs. Second, following current advances in skeleton-based action recognition, we devise a channel-wise graph neural network with multi-scale awareness and per-keypoint temporal re-weighting. With this design, the backbone is capable of leveraging both manual and non-manual features. The combination of hand mixing and the channel-wise multi-scale GCN backbone allows us to achieve state-of-the-art accuracy on both the WLASL and NMFs-CSL benchmarks.
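
The part-mixing step described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. The array layout `(frames, keypoints, coordinates)`, the function name `part_mix`, the keypoint indices in `hand_idx`, and the fixed mixing weight `lam` are all assumptions made for illustration; the abstract states only that keypoints in one region are swapped between two samples and that the labels are combined linearly.

```python
import numpy as np


def part_mix(x_a, y_a, x_b, y_b, part_idx, lam=0.5):
    """Swap one body region from sample B into sample A and blend the labels.

    x_a, x_b : arrays of shape (T, K, C) = (frames, keypoints, coordinates).
    y_a, y_b : one-hot label vectors of shape (num_classes,).
    part_idx : indices of the keypoints forming the swapped region.
    lam      : weight given to the donor label (hypothetical parameter; the
               abstract does not say how the weight is chosen).
    """
    x_a = np.asarray(x_a, dtype=float)
    x_b = np.asarray(x_b, dtype=float)
    if x_a.shape != x_b.shape:
        raise ValueError("both samples must share the same (T, K, C) shape")

    mixed = x_a.copy()
    # Replace the selected region in every frame with the donor's keypoints.
    mixed[:, part_idx, :] = x_b[:, part_idx, :]

    # Linear combination of the two labels.
    y_a = np.asarray(y_a, dtype=float)
    y_b = np.asarray(y_b, dtype=float)
    mixed_label = (1.0 - lam) * y_a + lam * y_b
    return mixed, mixed_label


# Toy example with made-up dimensions and keypoint indices.
rng = np.random.default_rng(0)
T, K, C, num_classes = 16, 10, 2, 5
x_a = rng.normal(size=(T, K, C))
x_b = rng.normal(size=(T, K, C))
y_a = np.eye(num_classes)[1]
y_b = np.eye(num_classes)[3]
hand_idx = [6, 7, 8, 9]  # hypothetical indices for one hand

mixed, label = part_mix(x_a, y_a, x_b, y_b, hand_idx, lam=0.3)
```

In this sketch both samples must already have the same number of frames; a real pipeline would need to resample or pad sequences of different lengths before mixing.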


Pages: 4271-4280

Page count: 10
