Human Action Recognition Research Based on Fusion TS-CNN and LSTM Networks

Cited by: 0
Authors
Hui Zan
Gang Zhao
Affiliations
[1] Zhejiang Normal University, Key Laboratory of Intelligent Education Technology and Application of Zhejiang Province
[2] Central China Normal University, Faculty of Artificial Intelligence in Education
Keywords
Multistream network; Human action recognition; TS-LSTM; CNN-LSTM
DOI
Not available
Abstract
Human action recognition (HAR) technology is currently of significant interest. Traditional HAR methods generally depend on the temporal and spatial information of the video stream. They require large training datasets and have long response times, so they fail to meet the technical requirements of real-time interaction simultaneously: high accuracy, low delay, and low computational cost. For instance, a gymnastic action may last as little as 0.2 s, and the system must go from action capture to recognition and then to the visualization of a three-dimensional character model. Only when the response time of the application system is short enough can it guide synchronous training and accurate evaluation. To reduce the dependence on the amount of video data and meet these HAR requirements, this paper proposes a three-stream CNN-LSTM (TS-CNN-LSTM) framework that combines convolutional neural network (CNN) and long short-term memory (LSTM) networks. First, color, depth, and skeleton data of the human body collected by Microsoft Kinect are used as input to reduce the sample size. Second, heterogeneous convolutional networks are established to reduce computing costs and shorten response time. The experimental results demonstrate the effectiveness of the proposed model on the NTU RGB+D dataset, where it reaches a best accuracy of 87.28% in the cross-subject mode. Compared with state-of-the-art methods, our method uses 75% of the training sample size, while its time and space complexity are only 67.5% and 73.98% of theirs, respectively. The response time for recognizing one set of actions is improved by 0.90–1.61 s, which is especially valuable for timely action feedback. The proposed method provides an effective solution for real-time interactive applications that require timely human action recognition results and responses.
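The abstract names three input streams (color, depth, skeleton) but does not say how their outputs are combined. As a rough illustration only, the sketch below assumes score-level (late) fusion: each stream is imagined to have already produced per-class logits, and the fused prediction is a weighted average of the per-stream softmax probabilities. The function names, the equal default weights, and the example logits are all hypothetical and are not taken from the paper.

```python
import math


def softmax(logits):
    """Convert raw class scores to probabilities (numerically stable)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_streams(stream_logits, weights=None):
    """Weighted average of per-stream class probabilities.

    stream_logits: dict mapping a stream name to its list of class logits.
    weights: optional dict of per-stream weights; defaults to equal weights
             (an assumption, not the paper's setting).
    """
    names = list(stream_logits)
    if weights is None:
        weights = {name: 1.0 / len(names) for name in names}
    num_classes = len(stream_logits[names[0]])
    fused = [0.0] * num_classes
    for name in names:
        probs = softmax(stream_logits[name])
        for i, p in enumerate(probs):
            fused[i] += weights[name] * p
    return fused


def predict(stream_logits, weights=None):
    """Return the index of the class with the highest fused probability."""
    fused = fuse_streams(stream_logits, weights)
    return max(range(len(fused)), key=fused.__getitem__)


# Hypothetical logits for three action classes from each stream.
example = {
    "color": [2.0, 0.5, 0.1],
    "depth": [0.2, 1.5, 0.3],
    "skeleton": [1.8, 0.4, 0.2],
}

if __name__ == "__main__":
    print(fuse_streams(example))
    print(predict(example))
```

In this toy input the depth stream alone favors class 1, while the fused scores favor class 0 because the color and skeleton streams agree; this kind of disagreement between modalities is what multistream fusion is meant to resolve.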
Pages: 2331 - 2345
Page count: 14
Related papers
50 in total
  • [1] Human Action Recognition Research Based on Fusion TS-CNN and LSTM Networks
    Zan, Hui
    Zhao, Gang
    ARABIAN JOURNAL FOR SCIENCE AND ENGINEERING, 2023, 48 (02) : 2331 - 2345
  • [2] CNN-LSTM-Based Late Sensor Fusion for Human Activity Recognition in Big Data Networks
    Baloch, Zartasha
    Shaikh, Faisal Karim
    Unar, Mukhtiar Ali
    WIRELESS COMMUNICATIONS & MOBILE COMPUTING, 2022, 2022
  • [3] Human Activity Recognition Based on CNN and LSTM
    Tan, Xu-Nan
    Journal of Computers (Taiwan), 2023, 34 (03) : 221 - 235
  • [4] Skeleton-based human action recognition with sequential convolutional-LSTM networks and fusion strategies
    Khowaja, S. A.
    Lee, S.-L.
    Journal of Ambient Intelligence and Humanized Computing, 2022, 13 (08) : 3729 - 3746
  • [5] Human Action Recognition Based on Improved Fusion Attention CNN and RNN
    Zhao, Han
    Jin, Xinyu
    2020 5TH INTERNATIONAL CONFERENCE ON COMPUTATIONAL INTELLIGENCE AND APPLICATIONS (ICCIA 2020), 2020, : 108 - 112
  • [6] Human action recognition using attention based LSTM network with dilated CNN features
    Muhammad, Khan
    Mustaqeem
    Ullah, Amin
    Imran, Ali Shariq
    Sajjad, Muhammad
    Kiran, Mustafa Servet
    Sannino, Giovanna
    de Albuquerque, Victor Hugo C.
    FUTURE GENERATION COMPUTER SYSTEMS-THE INTERNATIONAL JOURNAL OF ESCIENCE, 2021, 125 : 820 - 830
  • [7] SKELETON-BASED ACTION RECOGNITION USING LSTM AND CNN
    Li, Chuankun
    Wang, Pichao
    Wang, Shuang
    Hou, Yonghong
    Li, Wanqing
    2017 IEEE INTERNATIONAL CONFERENCE ON MULTIMEDIA & EXPO WORKSHOPS (ICMEW), 2017,
  • [8] Multimodal Human Action Recognition Based on a Fusion of Dynamic Images using CNN descriptors
    Cardenas, Edwin Escobedo
    Chavez, Guillermo Camara
    PROCEEDINGS 2018 31ST SIBGRAPI CONFERENCE ON GRAPHICS, PATTERNS AND IMAGES (SIBGRAPI), 2018, : 95 - 102
  • [9] Cascading Pose Features with CNN-LSTM for Multiview Human Action Recognition
    Malik, Najeeb ur Rehman
    Abu-Bakar, Syed Abdul Rahman
    Sheikh, Usman Ullah
    Channa, Asma
    Popescu, Nirvana
    SIGNALS, 2023, 4 (01) : 40 - 55
  • [10] Human Action Recognition Based on Temporal Pose CNN and Multi-dimensional Fusion
    Huang, Yi
    Lai, Shang-Hong
    Tai, Shao-Heng
    COMPUTER VISION - ECCV 2018 WORKSHOPS, PT II, 2019, 11130 : 426 - 440