Spatial-temporal interaction learning based two-stream network for action recognition

Cited by: 38
Authors
Liu, Tianyu [1 ]
Ma, Yujun [2 ]
Yang, Wenhan [1 ]
Ji, Wanting [3 ]
Wang, Ruili [2 ]
Jiang, Ping [1 ]
Affiliations
[1] Hunan Agricultural University, College of Mechanical & Electrical Engineering, Changsha, People's Republic of China
[2] Massey University, School of Mathematical & Computational Sciences, Auckland, New Zealand
[3] Liaoning University, School of Information, Shenyang, People's Republic of China
Keywords
Action recognition; Spatial-temporal; Two-stream CNNs
DOI
10.1016/j.ins.2022.05.092
Chinese Library Classification
TP [Automation & Computer Technology]
Subject Classification Code
0812
Abstract
Two-stream convolutional neural networks have been widely applied to action recognition. However, the two streams are usually adopted to capture spatial information and temporal information separately, which ignores the strong complementarity and correlation between spatial and temporal information in videos. To solve this problem, we propose a Spatial-Temporal Interaction Learning Two-stream network (STILT) for action recognition. Our proposed two-stream network (i.e., a spatial stream and a temporal stream) has a spatial-temporal interaction learning module, which uses an alternating co-attention mechanism between the two streams to learn the correlation between spatial features and temporal features. This module allows the two streams to guide each other and then generates optimized spatial attention features and temporal attention features. Thus, the proposed network can establish an interactive connection between the two streams, which efficiently exploits the attended spatial and temporal features to improve recognition accuracy. Experiments on three widely used datasets (i.e., UCF101, HMDB51 and Kinetics) show that the proposed network outperforms state-of-the-art models in action recognition.
© 2022 Elsevier Inc. All rights reserved.
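To make the idea of alternating co-attention between two streams concrete, below is a minimal PyTorch sketch. The class name AlternatingCoAttention, the layer sizes, and the three-step attention schedule are illustrative assumptions based on the abstract's description, not the authors' released implementation; both streams are assumed to produce feature sequences of the same dimension.

import torch
import torch.nn as nn
import torch.nn.functional as F

class AlternatingCoAttention(nn.Module):
    # Illustrative sketch: alternating co-attention between a spatial and a
    # temporal feature sequence, each shaped [batch, length, dim]. All names
    # and sizes here are hypothetical, not taken from the paper.
    def __init__(self, dim, hidden=256):
        super().__init__()
        self.proj_feat = nn.Linear(dim, hidden)   # projects the attended sequence
        self.proj_guide = nn.Linear(dim, hidden)  # projects the guiding summary
        self.score = nn.Linear(hidden, 1)         # scalar attention score per step

    def attend(self, x, guide=None):
        # x: [B, N, D]; guide: [B, D] summary of the other stream, or None.
        h = self.proj_feat(x)
        if guide is not None:
            h = h + self.proj_guide(guide).unsqueeze(1)  # broadcast over N
        weights = F.softmax(self.score(torch.tanh(h)), dim=1)  # [B, N, 1]
        return (weights * x).sum(dim=1)  # attended summary, [B, D]

    def forward(self, spatial, temporal):
        # Step 1: summarize the temporal stream without guidance.
        t0 = self.attend(temporal)
        # Step 2: attend to spatial features guided by the temporal summary.
        s1 = self.attend(spatial, guide=t0)
        # Step 3: re-attend to temporal features guided by the spatial summary.
        t1 = self.attend(temporal, guide=s1)
        return s1, t1  # attended spatial / temporal features for fusion

The two attended vectors could then be concatenated and fed to a classifier, so each stream's final representation is conditioned on the other; this mirrors the mutual-guidance behaviour the abstract describes.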
Pages: 864-876 (13 pages)
Related papers (50 in total)
  • [31] Spatial-temporal pyramid based Convolutional Neural Network for action recognition
    Zheng, Zhenxing
    An, Gaoyun
    Wu, Dapeng
    Ruan, Qiuqi
    NEUROCOMPUTING, 2019, 358 : 446 - 455
  • [32] Two-Stream Temporal Convolutional Networks for Skeleton-Based Human Action Recognition
    Jia, Jin-Gong
    Zhou, Yuan-Feng
    Hao, Xing-Wei
    Li, Feng
    Desrosiers, Christian
    Zhang, Cai-Ming
    JOURNAL OF COMPUTER SCIENCE AND TECHNOLOGY, 2020, 35 (03) : 538 - 550
  • [34] Two-Stream Convolution Neural Network with Video-stream for Action Recognition
    Dai, Wei
    Chen, Yimin
    Huang, Chen
    Gao, Ming-Ke
    Zhang, Xinyu
2019 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2019
  • [35] Spatial-Temporal Convolutional Attention Network for Action Recognition
    Luo, Huilan
    Chen, Han
Computer Engineering and Applications, 2023, 59 (09) : 150 - 158
  • [36] Spatial-Temporal Interleaved Network for Efficient Action Recognition
    Jiang, Shengqin
    Zhang, Haokui
    Qi, Yuankai
    Liu, Qingshan
    IEEE TRANSACTIONS ON INDUSTRIAL INFORMATICS, 2025, 21 (01) : 178 - 187
  • [37] Two-Stream Spatial-Temporal Graph Convolutional Networks for Driver Drowsiness Detection
    Bai, Jing
    Yu, Wentao
    Xiao, Zhu
    Havyarimana, Vincent
    Regan, Amelia C.
    Jiang, Hongbo
    Jiao, Licheng
    IEEE TRANSACTIONS ON CYBERNETICS, 2022, 52 (12) : 13821 - 13833
  • [38] Multi-Stream and Enhanced Spatial-Temporal Graph Convolution Network for Skeleton-Based Action Recognition
    Li, Fanjia
    Zhu, Aichun
    Xu, Yonggang
    Cui, Ran
    Hua, Gang
    IEEE ACCESS, 2020, 8 : 97757 - 97770
  • [39] Interactive two-stream graph neural network for skeleton-based action recognition
    Yang, Dun
    Zhou, Qing
    Wen, Ju
    JOURNAL OF ELECTRONIC IMAGING, 2021, 30 (03)
  • [40] TBRNet: Two-Stream BiLSTM Residual Network for Video Action Recognition
    Wu, Xiao
    Ji, Qingge
    ALGORITHMS, 2020, 13 (07) : 1 - 21