Human Action Recognition Research Based on Fusion TS-CNN and LSTM Networks

Authors
Hui Zan
Gang Zhao
Affiliations
[1] Zhejiang Normal University, Key Laboratory of Intelligent Education Technology and Application of Zhejiang Province
[2] Central China Normal University, Faculty of Artificial Intelligence in Education
Keywords
Multistream network; Human action recognition; TS-LSTM; CNN-LSTM
DOI
Not available
Abstract
Human action recognition (HAR) technology is currently of significant interest. Traditional HAR methods generally depend on the temporal and spatial information of the video stream. They require large training datasets and have long response times, so they fail to meet the requirements of real-time interaction simultaneously: high accuracy, low delay, and low computational cost. For instance, a gymnastic action may last as little as 0.2 s, while the pipeline runs from action capture to recognition and then to the visualization of a three-dimensional character model. Only when the response time of the application system is short enough can it guide synchronous training and accurate evaluation. To reduce the dependence on the amount of video data and meet these HAR requirements, this paper proposes a three-stream framework (TS-CNN-LSTM) that combines CNN and long short-term memory (LSTM) networks. First, color, depth, and skeleton data of the human body collected by Microsoft Kinect are used as input to reduce the sample size. Second, heterogeneous convolutional networks are established to reduce computing cost and improve response time. The experimental results demonstrate the effectiveness of the proposed model on the NTU RGB+D dataset, where it reaches a best accuracy of 87.28% in the cross-subject mode. Compared with state-of-the-art methods, the proposed method uses 75% of the training sample size, and its time and space complexity are only 67.5% and 73.98% of theirs, respectively. The response time for recognizing one set of actions improves by 0.90–1.61 s, which is especially valuable for timely action feedback. The proposed method provides an effective solution for real-time interactive applications that require timely human action recognition results and responses.
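The abstract names three input streams (color, depth, skeleton) feeding CNNs whose outputs are combined with an LSTM, but it does not say how the streams are fused or how the networks are configured. The following is only a minimal sketch of one plausible arrangement, not the authors' implementation. It assumes feature-level fusion by concatenation, replaces the three CNN streams with random placeholder feature vectors, and uses a small hand-written LSTM cell. All names, dimensions, and weights (`TinyLSTMCell`, `classify_clip`, `RGB_DIM`, and so on) are hypothetical.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def matvec(W, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def rand_matrix(rows, cols, rng):
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


def softmax(z):
    m = max(z)
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]


class TinyLSTMCell:
    """Standard LSTM cell: input (i), forget (f), output (o) gates and candidate (g)."""

    def __init__(self, input_size, hidden_size, rng):
        self.hidden_size = hidden_size
        # One weight matrix per gate, acting on the concatenation [x; h].
        self.W = {k: rand_matrix(hidden_size, input_size + hidden_size, rng) for k in "ifog"}
        self.b = {k: [0.0] * hidden_size for k in "ifog"}

    def step(self, x, h, c):
        xh = x + h  # list concatenation
        pre = {k: [a + b for a, b in zip(matvec(self.W[k], xh), self.b[k])] for k in "ifog"}
        i = [sigmoid(v) for v in pre["i"]]
        f = [sigmoid(v) for v in pre["f"]]
        o = [sigmoid(v) for v in pre["o"]]
        g = [math.tanh(v) for v in pre["g"]]
        c_new = [fv * cv + iv * gv for fv, cv, iv, gv in zip(f, c, i, g)]
        h_new = [ov * math.tanh(cv) for ov, cv in zip(o, c_new)]
        return h_new, c_new


def classify_clip(rgb_feats, depth_feats, skel_feats, cell, W_out):
    """Fuse per-frame stream features, run the LSTM over time, classify the last state."""
    h = [0.0] * cell.hidden_size
    c = [0.0] * cell.hidden_size
    for r, d, s in zip(rgb_feats, depth_feats, skel_feats):
        fused = r + d + s  # ASSUMPTION: fusion by concatenation
        h, c = cell.step(fused, h, c)
    return softmax(matvec(W_out, h))


rng = random.Random(0)
T, N_CLASSES = 8, 5                                   # hypothetical clip length and class count
RGB_DIM, DEPTH_DIM, SKEL_DIM, HIDDEN = 6, 4, 3, 10    # hypothetical sizes

# Placeholder per-frame features standing in for the outputs of three CNN streams.
rgb = [[rng.random() for _ in range(RGB_DIM)] for _ in range(T)]
depth = [[rng.random() for _ in range(DEPTH_DIM)] for _ in range(T)]
skel = [[rng.random() for _ in range(SKEL_DIM)] for _ in range(T)]

cell = TinyLSTMCell(RGB_DIM + DEPTH_DIM + SKEL_DIM, HIDDEN, rng)
W_out = rand_matrix(N_CLASSES, HIDDEN, rng)
probs = classify_clip(rgb, depth, skel, cell, W_out)
print(len(probs), round(sum(probs), 6))
```

With untrained random weights the class probabilities are meaningless; the sketch only illustrates the data flow from three per-frame feature streams through a recurrent layer to a class distribution.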
Pages: 2331-2345
Page count: 14