SKIM: Skeleton-Based Isolated Sign Language Recognition With Part Mixing

Cited by: 3
Authors
Lin, Kezhou [1 ]
Wang, Xiaohan [2 ]
Zhu, Linchao [1 ]
Zhang, Bang [3 ]
Yang, Yi [1 ]
Affiliations
[1] Zhejiang Univ, Hangzhou 310027, Peoples R China
[2] Zhejiang Univ, Coll Comp Sci & Technol, Hangzhou 310027, Peoples R China
[3] Alibaba Grp, DAMO Acad, Hangzhou 311121, Peoples R China
Keywords
Sign language; Face recognition; Biological system modeling; Manuals; Benchmark testing; Assistive technologies; Data augmentation; sign language recognition; skeleton; MODEL;
DOI
10.1109/TMM.2023.3321502
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
In this article, we present SKIM, a skeleton-based isolated sign language recognition (IsoSLR) model with part mixing that takes only the skeleton representation of the human body as input. Previous skeleton-based works either perform worse than their RGB-based counterparts or require fusion with other modalities to obtain competitive results. With SKIM, a single skeleton-based model without complex pre-training can obtain similar or even higher accuracy than current state-of-the-art methods. This margin can be further increased by simple late fusion within the same modality. To achieve this, we first develop a novel data augmentation technique called part mixing. It swaps the corresponding keypoints within one region (e.g., the hand) between two randomly selected samples and linearly combines their labels to form the new label. Because regions such as the hands and face are key articulators in sign language, directly swapping such parts creates a believable pseudo sign that encourages the model to recognize the true pairs. Second, following recent advances in skeleton-based action recognition, we devise a channel-wise graph neural network with multi-scale awareness and per-keypoint temporal re-weighting. With this design, the backbone can leverage both manual and non-manual features. The combination of hand mixing and the channel-wise multi-scale GCN backbone allows us to achieve state-of-the-art accuracy on both the WLASL and NMFs-CSL benchmarks.
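The part mixing step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the keypoint index ranges in `PARTS`, the function name `part_mix`, the mixing weight `lam`, and the assumption that both samples have already been resampled to the same number of frames are all hypothetical choices made for illustration. The abstract states only that keypoints within one region are swapped between two samples and that the labels are combined linearly.

```python
import numpy as np

# Hypothetical keypoint layout (an assumption, not the paper's layout):
# each part maps to the keypoint indices that belong to it.
PARTS = {
    "left_hand": np.arange(0, 21),
    "right_hand": np.arange(21, 42),
    "face": np.arange(42, 60),
}


def part_mix(x_a, y_a, x_b, y_b, part, lam=0.5):
    """Swap one body part from sample B into sample A.

    x_a, x_b: skeleton sequences of shape (frames, keypoints, coords),
              assumed here to have identical shapes.
    y_a, y_b: one-hot label vectors of equal length.
    lam:      assumed weight given to sample A's label.
    """
    idx = PARTS[part]
    mixed = x_a.copy()
    # Replace the chosen region's keypoints in every frame.
    mixed[:, idx, :] = x_b[:, idx, :]
    # Linear combination of the two labels.
    label = lam * y_a + (1.0 - lam) * y_b
    return mixed, label
```

In a training loop, one would typically pick the second sample and the part at random per batch element; how `lam` should be chosen is not specified in the abstract, so the fixed default here is only a placeholder.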
Pages: 4271-4280
Page count: 10
Related papers
50 in total
  • [1] An effective skeleton-based approach for multilingual sign language recognition
    Renjith, S.
    Suresh, M. S. Sumi
    Rashmi, Manazhy
    ENGINEERING APPLICATIONS OF ARTIFICIAL INTELLIGENCE, 2025, 143
  • [2] Skeleton-based Online Sign Language Recognition using Monotonic Attention
    Takayama, Natsuki
    Benitez-Garcia, Gibran
    Takahashi, Hiroki
    PROCEEDINGS OF THE 17TH INTERNATIONAL JOINT CONFERENCE ON COMPUTER VISION, IMAGING AND COMPUTER GRAPHICS THEORY AND APPLICATIONS (VISAPP), VOL 5, 2022, : 601 - 608
  • [3] Hand Graph Topology Selection for Skeleton-based Sign Language Recognition
    Ozdemir, Ogulcan
    Baytas, Inci M.
    Akarun, Lale
    2024 IEEE 18TH INTERNATIONAL CONFERENCE ON AUTOMATIC FACE AND GESTURE RECOGNITION, FG 2024, 2024,
  • [4] Skeleton-Based Data Augmentation for Sign Language Recognition Using Adversarial Learning
    Nakamura, Yuriya
    Jing, Lei
    IEEE ACCESS, 2025, 13 : 15290 - 15300
  • [5] Asymmetric multi-branch GCN for skeleton-based sign language recognition
    Liu, Yuhong
    Lu, Fei
    Cheng, Xianpeng
    Yuan, Ying
    MULTIMEDIA TOOLS AND APPLICATIONS, 2024, 83 (30) : 75293 - 75319
  • [6] Multi-cue temporal modeling for skeleton-based sign language recognition
    Ozdemir, Ogulcan
    Baytas, Inci M.
    Akarun, Lale
    FRONTIERS IN NEUROSCIENCE, 2023, 17
  • [7] SC2SLR: Skeleton-based Contrast for Sign Language Recognition
    Lyu, Silu
    2024 5TH INTERNATIONAL CONFERENCE ON COMPUTING, NETWORKS AND INTERNET OF THINGS, CNIOT 2024, 2024, : 404 - 410
  • [8] Bidirectional Skeleton-Based Isolated Sign Recognition using Graph Convolutional Networks
    Dafnis, Konstantinos M.
    Chroni, Evgenia
    Neidle, Carol
    Metaxas, Dimitris N.
    LREC 2022: THIRTEEN INTERNATIONAL CONFERENCE ON LANGUAGE RESOURCES AND EVALUATION, 2022, : 7328 - 7338
  • [9] SML: A Skeleton-based multi-feature learning method for sign language recognition
    Deng, Zhiwen
    Leng, Yuquan
    Hu, Jing
    Lin, Zengrong
    Li, Xuerui
    Gao, Qing
    KNOWLEDGE-BASED SYSTEMS, 2024, 301
  • [10] Isolated Sign Language Recognition based on Tree Structure Skeleton Images
    Laines, David
    Gonzalez-Mendoza, Miguel
    Ochoa-Ruiz, Gilberto
    Bejarano, Gissella
    2023 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION WORKSHOPS, CVPRW, 2023, : 276 - 284