AMA: attention-based multi-feature aggregation module for action recognition

Cited: 1
Authors
Yu, Mengyun [1 ]
Chen, Ying [1 ]
Affiliations
[1] Jiangnan Univ, Minist Educ, Key Lab Adv Proc Control Light Ind, Wuxi 214000, Jiangsu, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Action recognition; Channel excitation; Spatial-temporal aggregation; Convolution neural network; FRAMEWORK;
DOI
10.1007/s11760-022-02268-2
Chinese Library Classification (CLC)
TM [Electrical Technology]; TN [Electronic Technology, Communication Technology];
Discipline codes
0808 ; 0809 ;
Abstract
Spatial information learning, temporal modeling, and capturing channel relationships are all important for action recognition in videos. In this work, an attention-based multi-feature aggregation (AMA) module is proposed that encodes these features in a unified module comprising a spatial-temporal aggregation (STA) structure and a channel excitation (CE) structure. STA mainly employs two convolutions to model spatial and temporal features, respectively; the matrix multiplication in STA captures long-range dependencies. CE learns the importance of each channel, biasing the allocation of available resources toward the informative features. The AMA module is simple yet efficient, and can be inserted into a standard ResNet architecture without any modification, enhancing the representation of the network. We equip ResNet-50 with the AMA module to build an effective AMA Net with limited extra computation cost, only 1.002 times that of ResNet-50. Extensive experiments indicate that AMA Net outperforms state-of-the-art methods on UCF101 and HMDB51, exceeding the baseline by 6.2% and 10.0%, respectively. In short, AMA Net achieves the high accuracy of 3D convolutional neural networks while maintaining the complexity of 2D convolutional neural networks.
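The abstract describes the CE structure as learning the importance of each channel and re-weighting features accordingly. A minimal numpy sketch of such a squeeze-and-excitation style channel gate is shown below; all function names, shapes, and weight layouts here are illustrative assumptions, not the authors' actual implementation:

```python
import numpy as np

def channel_excitation(x, w1, w2):
    """Hypothetical sketch of a channel-excitation (CE) gate.

    x:  feature map of shape (C, T, H, W).
    w1: (C//r, C) weights squeezing channels into a bottleneck.
    w2: (C, C//r) weights restoring the channel dimension.
    Returns x with each channel scaled by a learned importance in (0, 1).
    """
    c = x.shape[0]
    # Squeeze: global average pool over time and space -> one value per channel
    z = x.reshape(c, -1).mean(axis=1)
    # Excitation: bottleneck FC -> ReLU -> FC -> sigmoid gives channel weights
    h = np.maximum(w1 @ z, 0.0)
    s = 1.0 / (1.0 + np.exp(-(w2 @ h)))
    # Re-scale each channel by its importance, biasing toward informative ones
    return x * s[:, None, None, None]
```

Because the gate lies in (0, 1), each channel is attenuated in proportion to its estimated importance while the tensor shape is preserved, which is what lets such a module drop into a ResNet block without structural changes.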
Pages: 619-626
Page count: 8
Related papers
50 records in total
  • [1] AMA: attention-based multi-feature aggregation module for action recognition
    Mengyun Yu
    Ying Chen
    Signal, Image and Video Processing, 2023, 17 : 619 - 626
  • [2] AMFF: A new attention-based multi-feature fusion method for intention recognition
    Liu, Cong
    Xu, Xiaolong
    KNOWLEDGE-BASED SYSTEMS, 2021, 233
  • [3] Multi-feature Fusion Action Recognition Based on Key Frames
    Zhao, Yuerong
    Gao, Ling
    He, Dan
    Guo, Hongbo
    Wang, Hai
    Zheng, Jie
    Yang, Xudong
    2019 SEVENTH INTERNATIONAL CONFERENCE ON ADVANCED CLOUD AND BIG DATA (CBD), 2019, : 279 - 284
  • [4] Action Recognition Based on Multi-feature Depth Motion Maps
    Wang, Dongli
    Ou, Fang
    Zhou, Yan
    IECON 2018 - 44TH ANNUAL CONFERENCE OF THE IEEE INDUSTRIAL ELECTRONICS SOCIETY, 2018, : 2683 - 2688
  • [5] Human Action Recognition Based on Skeleton Information and Multi-Feature Fusion
    Wang, Li
    Su, Bo
    Liu, Qunpo
    Gao, Ruxin
    Zhang, Jianjun
    Wang, Guodong
    ELECTRONICS, 2023, 12 (17)
  • [6] Robust Multi-Feature Learning for Skeleton-Based Action Recognition
    Wang, Yingfu
    Xu, Zheyuan
    Li, Li
    Yao, Jian
    IEEE ACCESS, 2019, 7 : 148658 - 148671
  • [7] Human Action Recognition Algorithm Based on Multi-Feature Map Fusion
    Wang, Haofei
    Li, Junfeng
    IEEE ACCESS, 2020, 8 : 150945 - 150954
  • [8] Medical Named Entity Recognition Based on Multi-Feature and Co-Attention
    Liu, Xinning
    Computer Engineering and Applications, 2024, 60 (06) : 188 - 198
  • [9] Human action recognition using multi-feature fusion
    Shao, Yan-Hua
    Board of Optronics Lasers, (25): 1818
  • [10] Action Recognition of Basketball Players Based on Hybrid Attention Module and Spatial Feature Pyramid Module
    Tan, Zhihua
    Gao, Sheng
    Wei, Shihai
    Zhang, Jingyu
    Zhu, Min
    JOURNAL OF INTERNET TECHNOLOGY, 2025, 26 (02): : 211 - 218