SKIM: Skeleton-Based Isolated Sign Language Recognition With Part Mixing

Cited by: 3
Authors
Lin, Kezhou [1]
Wang, Xiaohan [2]
Zhu, Linchao [1]
Zhang, Bang [3]
Yang, Yi [1]
Affiliations
[1] Zhejiang Univ, Hangzhou 310027, Peoples R China
[2] Zhejiang Univ, Coll Comp Sci & Technol, Hangzhou 310027, Peoples R China
[3] Alibaba Grp, DAMO Acad, Hangzhou 311121, Peoples R China
Keywords
Sign language; Face recognition; Biological system modeling; Manuals; Benchmark testing; Assistive technologies; Data augmentation; sign language recognition; skeleton; MODEL;
DOI
10.1109/TMM.2023.3321502
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
In this article, we present SKIM, a skeleton-based isolated sign language recognition (IsoSLR) model with part mixing that takes only the skeleton representation of the human body as input. Previous skeleton-based works either perform worse than their RGB-based counterparts or require fusion with other modalities to obtain competitive results. With SKIM, a single skeleton-based model without complex pre-training can obtain similar or even higher accuracy than current state-of-the-art methods. This margin can be further increased by simple late fusion within the same modality. To achieve this, we first develop a novel data augmentation technique called part mixing. It swaps the corresponding keypoints within one region (e.g., a hand) between two randomly selected samples and combines their labels linearly to form the new label. As regions such as the hands and face are key articulators in sign language, directly swapping such parts creates a believable pseudo sign that encourages the model to recognize the true pairs. Second, following current advances in skeleton-based action recognition, we devise a channel-wise graph neural network with multi-scale awareness and per-keypoint temporal re-weighting. With this design, the backbone is capable of leveraging both manual and non-manual features. The combination of hand mixing and the channel-wise multi-scale GCN backbone allows us to achieve state-of-the-art accuracy on both the WLASL and NMFs-CSL benchmarks.
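The part mixing step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `part_mix`, the `(frames, keypoints, channels)` array layout, the requirement that both samples already share the same shape, and the default mixing weight (the fraction of keypoints left unswapped) are all assumptions made for illustration. The abstract states only that the labels are combined linearly.

```python
import numpy as np


def part_mix(x_a, y_a, x_b, y_b, part_idx, lam=None):
    """Swap one body region of sample A with the same region from sample B.

    Hypothetical helper, not taken from the paper.
    x_a, x_b : arrays of shape (T, V, C) = (frames, keypoints, channels),
               assumed to be already resampled to the same shape.
    y_a, y_b : one-hot (or soft) label vectors of shape (K,).
    part_idx : keypoint indices of the region to swap (e.g. one hand).
    lam      : weight kept for label A; if None, assumed to be the
               fraction of keypoints that were NOT swapped.
    """
    if x_a.shape != x_b.shape:
        raise ValueError("both samples must share the same (T, V, C) shape")

    part_idx = np.asarray(part_idx)

    # Copy A so the original sample is left untouched, then overwrite the
    # selected region with the corresponding keypoints from B.
    mixed = x_a.copy()
    mixed[:, part_idx, :] = x_b[:, part_idx, :]

    if lam is None:
        # Assumption: weight labels by the share of keypoints from each sample.
        lam = 1.0 - len(part_idx) / x_a.shape[1]

    # Linear combination of the two labels, as described in the abstract.
    y_mixed = lam * np.asarray(y_a, dtype=float) + (1.0 - lam) * np.asarray(y_b, dtype=float)
    return mixed, y_mixed
```

In a training loop, a sketch like this would typically be applied to randomly paired samples within a batch, with `part_idx` chosen from a predefined set of regions such as the left hand, right hand, or face. The actual region definitions and mixing weights used by SKIM are not given in this record.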
Pages: 4271-4280
Page count: 10