Chinese Sign Language Recognition with Sequence to Sequence Learning

Cited by: 11
Authors
Mao, Chensi [1]
Huang, Shiliang [1]
Li, Xiaoxu [1]
Ye, Zhongfu [1]
Affiliations
[1] Univ Sci & Technol China, Natl Engn Lab Speech & Language Informat Proc, Dept Elect Engn & Informat Sci, Hefei 230027, Anhui, Peoples R China
Source
COMPUTER VISION, PT I | 2017 / Vol. 771
Keywords
Sign language recognition; Long short-term memory; Convolutional neural network; Trajectory
DOI
10.1007/978-981-10-7299-4_15
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
In this paper, we formulate Chinese sign language recognition (SLR) as a sequence-to-sequence problem and propose an encoder-decoder framework to handle it. The framework is built on a convolutional neural network (CNN) and a recurrent neural network (RNN) with long short-term memory (LSTM). Specifically, the CNN extracts the spatial features of the input frames, and two cascaded LSTM layers implement the encoder-decoder structure. The encoder-decoder learns not only the temporal information of the input features but also the context model of sign language words. We feed the image sequences captured by Microsoft Kinect 2.0 into the network to build an end-to-end model. We also build a second model that uses skeletal coordinates as the input of the encoder-decoder framework. In the recognition stage, a probability combination method is proposed to fuse these two models and obtain the final prediction. We validate our method on a self-built dataset, and the experimental results demonstrate the effectiveness of the proposed method.
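The abstract does not say how the probability combination is computed, so the following is only a minimal sketch of one plausible reading: a weighted average of the per-step word probabilities from the image-based and skeleton-based models, followed by an argmax. The function names, the `weight` parameter, the three-word vocabulary, and all probability values are hypothetical and are not taken from the paper.

```python
from typing import List, Sequence


def fuse_probabilities(
    p_image: Sequence[Sequence[float]],
    p_skeleton: Sequence[Sequence[float]],
    weight: float = 0.5,
) -> List[List[float]]:
    """Fuse two models' per-step word distributions by weighted averaging.

    Assumption: both models emit one distribution per output step over the
    same vocabulary. `weight` is the share given to the image-based model.
    """
    if len(p_image) != len(p_skeleton):
        raise ValueError("both models must produce the same number of steps")
    fused = []
    for step_img, step_skel in zip(p_image, p_skeleton):
        if len(step_img) != len(step_skel):
            raise ValueError("both models must share one vocabulary")
        combined = [
            weight * a + (1.0 - weight) * b
            for a, b in zip(step_img, step_skel)
        ]
        total = sum(combined)
        # Renormalise so each fused step is again a probability distribution.
        fused.append([c / total for c in combined])
    return fused


def decode(fused: Sequence[Sequence[float]], vocabulary: Sequence[str]) -> List[str]:
    """Pick the most probable word at every output step."""
    return [
        vocabulary[max(range(len(step)), key=step.__getitem__)]
        for step in fused
    ]


# Hypothetical vocabulary and model outputs for a two-word sentence.
vocabulary = ["I", "love", "you"]
p_image = [[0.6, 0.3, 0.1], [0.2, 0.5, 0.3]]
p_skeleton = [[0.5, 0.4, 0.1], [0.1, 0.3, 0.6]]

fused = fuse_probabilities(p_image, p_skeleton, weight=0.5)
prediction = decode(fused, vocabulary)
```

In this toy input the two models disagree at the second step, and the fused distribution follows whichever model is more confident there. Other combination rules, such as a product of probabilities or a sum of log-probabilities, would fit the abstract's description equally well.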
Pages: 180-191
Page count: 12
Related papers
50 in total
  • [1] Chinese Sign Language Recognition Based On Video Sequence Appearance Modeling
    Quan, Yang
    ICIEA 2010: PROCEEDINGS OF THE 5TH IEEE CONFERENCE ON INDUSTRIAL ELECTRONICS AND APPLICATIONS, VOL 3, 2010, : 385 - 390
  • [2] Multimodal Sign Language Recognition via Temporal Deformable Convolutional Sequence Learning
    Papadimitriou, Katerina
    Potamianos, Gerasimos
    INTERSPEECH 2020, 2020, : 2752 - 2756
  • [3] Continuous sign language recognition based on hierarchical memory sequence network
    Xue, Cuihong
    Jia, Jingli
    Yu, Ming
    Yan, Gang
    Guo, Yingchun
    Liu, Yuehao
    IET COMPUTER VISION, 2024, 18 (02) : 247 - 259
  • [4] Facial Expression Sequence Recognition for a Japanese Sign Language Training System
    Yabunaka, Keisuke
    Mori, Yuichiro
    Toyonaga, Masahiko
    2018 JOINT 10TH INTERNATIONAL CONFERENCE ON SOFT COMPUTING AND INTELLIGENT SYSTEMS (SCIS) AND 19TH INTERNATIONAL SYMPOSIUM ON ADVANCED INTELLIGENT SYSTEMS (ISIS), 2018, : 1348 - 1353
  • [5] MULTILINGUAL SEQUENCE-TO-SEQUENCE SPEECH RECOGNITION: ARCHITECTURE, TRANSFER LEARNING, AND LANGUAGE MODELING
    Cho, Jaejin
    Baskar, Murali Karthick
    Li, Ruizhi
    Wiesner, Matthew
    Mallidi, Sri Harish
    Yalta, Nelson
    Karafiat, Martin
    Watanabe, Shinji
    Hori, Takaaki
    2018 IEEE WORKSHOP ON SPOKEN LANGUAGE TECHNOLOGY (SLT 2018), 2018, : 521 - 527
  • [6] Recognition of Japanese sign language from image sequence using color combination
    Yoshino, K
    Kawashima, T
    Aoki, Y
    INTERNATIONAL CONFERENCE ON IMAGE PROCESSING, PROCEEDINGS - VOL III, 1996, : 511 - 514
  • [7] SeqαGAN: Sign Language Sequence Generation Based on Variational and Adversarial Learning
    Xiao, Qinkun
    Li, Lu
    Zhu, Yilin
    IEEE TRANSACTIONS ON INDUSTRIAL INFORMATICS, 2024, 20 (07) : 9320 - 9329
  • [8] Hallucination of Speech Recognition Errors With Sequence to Sequence Learning
    Serai, Prashant
    Sunder, Vishal
    Fosler-Lussier, Eric
    IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2022, 30 : 890 - 900
  • [9] Sequence-to-Sequence Contrastive Learning for Text Recognition
    Aberdam, Aviad
    Litman, Ron
    Tsiper, Shahar
    Anschel, Oron
    Slossberg, Ron
    Mazor, Shai
    Manmatha, R.
    Perona, Pietro
    2021 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, CVPR 2021, 2021, : 15297 - 15307
  • [10] A Sequence to Sequence Learning for Chinese Grammatical Error Correction
    Ren, Hongkai
    Yang, Liner
    Xun, Endong
    NATURAL LANGUAGE PROCESSING AND CHINESE COMPUTING, NLPCC 2018, PT II, 2018, 11109 : 401 - 410