SKIM: Skeleton-Based Isolated Sign Language Recognition With Part Mixing

Cited by: 3
Authors
Lin, Kezhou [1]
Wang, Xiaohan [2]
Zhu, Linchao [1]
Zhang, Bang [3]
Yang, Yi [1]
Affiliations
[1] Zhejiang Univ, Hangzhou 310027, Peoples R China
[2] Zhejiang Univ, Coll Comp Sci & Technol, Hangzhou 310027, Peoples R China
[3] Alibaba Grp, DAMO Acad, Hangzhou 311121, Peoples R China
Keywords
Sign language; Face recognition; Biological system modeling; Manuals; Benchmark testing; Assistive technologies; Data augmentation; sign language recognition; skeleton; MODEL
DOI
10.1109/TMM.2023.3321502
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
In this article, we present SKIM, skeleton-based isolated sign language recognition (IsoSLR) with part mixing: an IsoSLR model that takes only the skeleton representation of the human body as input. Previous skeleton-based works either perform worse than their RGB-based counterparts or require fusion with other modalities to obtain competitive results. With SKIM, a single skeleton-based model without complex pre-training can obtain similar or even higher accuracy than current state-of-the-art methods. This margin can be further increased by simple late fusion within the same modality. To achieve this, we first develop a novel data augmentation technique called part mixing. It swaps the corresponding keypoints within one region (e.g., the hand) between two randomly selected samples and combines their labels linearly to form the new label. As regions like the hands and face are key articulators for sign language, directly swapping such parts creates a believable pseudo sign that encourages the model to recognize the true pairs. Secondly, following current advances in skeleton-based action recognition, we devise a channel-wise graph neural network with multi-scale awareness and per-keypoint temporal re-weighting. With this design, the backbone is capable of leveraging both manual and non-manual features. The combination of hand mixing and the channel-wise multi-scale GCN backbone allows us to achieve state-of-the-art accuracy on both the WLASL and NMFs-CSL benchmarks.
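The part mixing step described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the keypoint layout in `PARTS`, the function name `part_mix`, the `(frames, keypoints, coordinates)` array shape, and the mixing weight `lam` are all assumptions made for illustration, and both samples are assumed to have the same number of frames.

```python
import numpy as np

# Hypothetical keypoint layout (an assumption, not taken from the paper):
# each part maps to a contiguous range of keypoint indices.
PARTS = {
    "left_hand": slice(0, 21),
    "right_hand": slice(21, 42),
    "face": slice(42, 60),
}


def part_mix(x_a, y_a, x_b, y_b, part, lam=0.5):
    """Swap one body part of sample A with the same part from sample B.

    x_a, x_b: arrays of shape (T, K, C) = (frames, keypoints, coordinates).
    y_a, y_b: one-hot (or soft) label vectors of equal length.
    lam:      weight of A's label in the mixed label (assumed value).
    """
    mixed = x_a.copy()  # leave the original sample untouched
    mixed[:, PARTS[part], :] = x_b[:, PARTS[part], :]
    # Linear combination of the two labels, as the abstract describes.
    label = lam * y_a + (1.0 - lam) * y_b
    return mixed, label


# Toy usage: A is all zeros, B is all ones, so the swapped part is easy to see.
x_a = np.zeros((4, 60, 2))
x_b = np.ones((4, 60, 2))
y_a = np.array([1.0, 0.0, 0.0])
y_b = np.array([0.0, 1.0, 0.0])
mixed, label = part_mix(x_a, y_a, x_b, y_b, "right_hand", lam=0.7)
```

In this toy call, only the keypoints in the assumed `right_hand` range come from sample B, and the mixed label weights the two classes by `lam` and `1 - lam`. How the weight is chosen in SKIM is not stated in the abstract.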
Pages: 4271-4280
Page count: 10
Related papers
50 in total
  • [31] Occluded Part-aware Graph Convolutional Networks for Skeleton-based Action Recognition
    Kim, Min Hyuk
    Kim, Min Ju
    Yoo, Seok Bong
    2024 IEEE INTERNATIONAL CONFERENCE ON ROBOTICS AND AUTOMATION, ICRA 2024, 2024, : 7310 - 7317
  • [32] Multi-Part Adaptive Graph Convolutional Network for Skeleton-Based Action Recognition
    Wang, Wei
    Xie, Wei
    Tu, Zhigang
    Li, Wanxin
    Jin, Lianghao
    2022 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2022,
  • [33] A Novel Skeleton Spatial Pyramid Model for Skeleton-based Action Recognition
    Li, Yanshan
    Guo, Tianyu
    Xia, Rongjie
    Liu, Xing
    2019 IEEE 4TH INTERNATIONAL CONFERENCE ON SIGNAL AND IMAGE PROCESSING (ICSIP 2019), 2019, : 16 - 20
  • [34] Skeleton MixFormer: Multivariate Topology Representation for Skeleton-based Action Recognition
    Xin, Wentian
    Miao, Qiguang
    Liu, Yi
    Liu, Ruyi
    Pun, Chi-Man
    Shi, Cheng
    PROCEEDINGS OF THE 31ST ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, MM 2023, 2023, : 2211 - 2220
  • [35] Hand Gesture Recognition for Sign Language: A Skeleton Approach
    Kumar, Y. H. Sharath
    Vinutha, V.
    PROCEEDINGS OF THE 4TH INTERNATIONAL CONFERENCE ON FRONTIERS IN INTELLIGENT COMPUTING: THEORY AND APPLICATIONS (FICTA) 2015, 2016, 404 : 611 - 623
  • [36] Isolated sign language characters recognition
    Santosa, Paulus Insap
    Telkomnika, 2013, 11 (03): : 583 - 590
  • [37] Skeleton-Based Action Recognition with Combined Part-Wise Topology Graph Convolutional Networks
    Zhu, Xiaowei
    Huang, Qian
    Li, Chang
    Cui, Jingwen
    Chen, Yingying
    PATTERN RECOGNITION AND COMPUTER VISION, PRCV 2023, PT I, 2024, 14425 : 43 - 59
  • [38] Temporal Extension Module for Skeleton-Based Action Recognition
    Obinata, Yuya
    Yamamoto, Takuma
    2020 25TH INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION (ICPR), 2021, : 534 - 540
  • [39] Adversarial Attack on Skeleton-Based Human Action Recognition
    Liu, Jian
    Akhtar, Naveed
    Mian, Ajmal
    IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2022, 33 (04) : 1609 - 1622
  • [40] Fully Attentional Network for Skeleton-Based Action Recognition
    Liu, Caifeng
    Zhou, Hongcheng
    IEEE ACCESS, 2023, 11 : 20478 - 20485