Separable 3D residual attention network for human action recognition

Cited by: 1
Authors
Zhang, Zufan [1 ]
Peng, Yue [1 ]
Gan, Chenquan [1 ]
Abate, Andrea Francesco [2 ]
Zhu, Lianxiang [3 ]
Affiliations
[1] Chongqing Univ Posts & Telecommun, Sch Commun & Informat Engn, Chongqing 400065, Peoples R China
[2] Univ Salerno, Dept Comp Sci, Via Giovanni Paolo II 132, I-84084 Fisciano, SA, Italy
[3] Xian Shiyou Univ, Sch Comp Sci, Xian 710065, Peoples R China
Keywords
Human computer interaction; Human action recognition; Residual network; Attention mechanism; Multi-stage training strategy; SPATIAL-TEMPORAL ATTENTION; FEATURES; LSTM;
DOI
10.1007/s11042-022-12972-3
CLC Number
TP [Automation and Computer Technology]
Discipline Code
0812
Abstract
As an important research topic in computer vision, human action recognition is regarded as a crucial means of communication and interaction between humans and computers. To help computers automatically recognize human behaviors and accurately understand human intentions, this paper proposes a separable three-dimensional residual attention network (Sep-3D RAN), a lightweight network that extracts informative spatial-temporal representations for video-based human-computer interaction. Specifically, Sep-3D RAN is constructed by stacking multiple separable three-dimensional residual attention blocks, in which each standard three-dimensional convolution is approximated as a cascade of a two-dimensional spatial convolution and a one-dimensional temporal convolution. A dual attention mechanism is then built by sequentially embedding a channel attention sub-module and a spatial attention sub-module in each residual block, thereby acquiring more discriminative features and improving the model's guidance capability. Furthermore, a multi-stage training strategy is used to train Sep-3D RAN, which effectively alleviates over-fitting. Finally, experimental results demonstrate that Sep-3D RAN surpasses existing state-of-the-art methods.
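The block structure described in the abstract can be pictured with a short sketch. The following is a minimal PyTorch sketch, not the authors' implementation: it assumes a (2+1)D-style factorization (a 1x3x3 spatial convolution followed by a 3x1x1 temporal convolution) and CBAM-style channel and spatial attention sub-modules applied sequentially inside a residual block. The class name Sep3DResAttentionBlock, the reduction ratio, and all kernel and layer sizes are illustrative assumptions.

```python
# Minimal sketch of one separable 3D residual attention block (assumed design,
# not the paper's exact architecture). Requires PyTorch.
import torch
import torch.nn as nn


class Sep3DResAttentionBlock(nn.Module):
    def __init__(self, channels: int, reduction: int = 16):
        super().__init__()
        # Separable 3D convolution: 2D spatial (1x3x3) then 1D temporal (3x1x1).
        self.spatial_conv = nn.Conv3d(channels, channels, kernel_size=(1, 3, 3),
                                      padding=(0, 1, 1), bias=False)
        self.temporal_conv = nn.Conv3d(channels, channels, kernel_size=(3, 1, 1),
                                       padding=(1, 0, 0), bias=False)
        self.bn = nn.BatchNorm3d(channels)
        self.relu = nn.ReLU(inplace=True)
        # Channel attention: squeeze T, H, W, then re-weight each channel.
        self.channel_mlp = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
            nn.Sigmoid(),
        )
        # Spatial attention: squeeze channels, then re-weight each T, H, W position.
        self.spatial_att = nn.Sequential(
            nn.Conv3d(2, 1, kernel_size=(1, 7, 7), padding=(0, 3, 3), bias=False),
            nn.Sigmoid(),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, channels, frames, height, width)
        residual = x
        out = self.relu(self.bn(self.temporal_conv(self.spatial_conv(x))))

        # Channel attention sub-module.
        b, c = out.shape[:2]
        w = self.channel_mlp(out.mean(dim=(2, 3, 4))).view(b, c, 1, 1, 1)
        out = out * w

        # Spatial attention sub-module, applied after channel attention.
        avg_map = out.mean(dim=1, keepdim=True)   # (B, 1, T, H, W)
        max_map = out.amax(dim=1, keepdim=True)   # (B, 1, T, H, W)
        out = out * self.spatial_att(torch.cat([avg_map, max_map], dim=1))

        return self.relu(out + residual)          # residual connection


if __name__ == "__main__":
    block = Sep3DResAttentionBlock(channels=64)
    clip = torch.randn(2, 64, 16, 56, 56)         # (B, C, T, H, W)
    print(block(clip).shape)                      # torch.Size([2, 64, 16, 56, 56])
```

Stacking several such blocks, with downsampling stages in between, would give the kind of lightweight spatial-temporal backbone the abstract outlines.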
Pages: 5435-5453
Number of pages: 19
Related Papers
50 records in total
  • [1] Separable 3D residual attention network for human action recognition
    Zufan Zhang
    Yue Peng
    Chenquan Gan
    Andrea Francesco Abate
    Lianxiang Zhu
    Multimedia Tools and Applications, 2023, 82 : 5435 - 5453
  • [2] AR3D: Attention Residual 3D Network for Human Action Recognition
    Dong, Min
    Fang, Zhenglin
    Li, Yongfa
    Bi, Sheng
    Chen, Jiangcheng
    SENSORS, 2021, 21 (05) : 1 - 15
  • [3] 3D RANs: 3D Residual Attention Networks for action recognition
    Jiahui Cai
    Jianguo Hu
    The Visual Computer, 2020, 36 : 1261 - 1270
  • [4] 3D RANs: 3D Residual Attention Networks for action recognition
    Cai, Jiahui
    Hu, Jianguo
    VISUAL COMPUTER, 2020, 36 (06): 1261 - 1270
  • [5] Weakly-supervised temporal attention 3D network for human action recognition
    Kim, Jonghyun
    Li, Gen
    Yun, Inyong
    Jung, Cheolkon
    Kim, Joongkyu
    PATTERN RECOGNITION, 2021, 119
  • [6] Human Action Recognition with 3D Convolutional Neural Network
    Lima, Tiago
    Fernandes, Bruno
    Barros, Pablo
    2017 IEEE LATIN AMERICAN CONFERENCE ON COMPUTATIONAL INTELLIGENCE (LA-CCI), 2017
  • [7] 3D Residual Networks with Channel-Spatial Attention Module for Action Recognition
    Yi, Ziwen
    Sun, Zhonghua
    Feng, Jinchao
    Jia, Kebin
    2020 CHINESE AUTOMATION CONGRESS (CAC 2020), 2020, : 5171 - 5174
  • [8] Multi-cue based 3D residual network for action recognition
    Zong, Ming
    Wang, Ruili
    Chen, Zhe
    Wang, Maoli
    Wang, Xun
    Potgieter, Johan
    NEURAL COMPUTING & APPLICATIONS, 2021, 33 (10): 5167 - 5181
  • [9] Multi-cue based 3D residual network for action recognition
    Ming Zong
    Ruili Wang
    Zhe Chen
    Maoli Wang
    Xun Wang
    Johan Potgieter
    Neural Computing and Applications, 2021, 33 : 5167 - 5181
  • [10] A stroke image recognition model based on 3D residual network and attention mechanism
    Hou, Yingan
    Su, Junguang
    Liang, Jun
    Chen, Xiwen
    Liu, Qin
    Deng, Liang
    Liao, Jiyuan
    JOURNAL OF INTELLIGENT & FUZZY SYSTEMS, 2022, 43 (04) : 5205 - 5214