SA-NET: SHUFFLE ATTENTION FOR DEEP CONVOLUTIONAL NEURAL NETWORKS

Cited by: 597
Authors
Zhang, Qing-Long [1 ]
Yang, Yu-Bin [1 ]
Affiliations
[1] Nanjing University, State Key Laboratory for Novel Software Technology, Nanjing, People's Republic of China
Keywords
spatial attention; channel attention; channel shuffle; grouped features
DOI
10.1109/ICASSP39728.2021.9414568
Chinese Library Classification (CLC)
O42 [Acoustics]
Discipline Code
070206; 082403
Abstract
Attention mechanisms, which enable a neural network to focus accurately on the relevant elements of the input, have become an essential component for improving the performance of deep neural networks. Two attention mechanisms are widely used in computer vision: spatial attention and channel attention, which capture pixel-level pairwise relationships and channel dependencies, respectively. Although fusing them can achieve better performance than either alone, it inevitably increases the computational overhead. In this paper, we propose an efficient Shuffle Attention (SA) module to address this issue, which adopts Shuffle Units to combine the two attention mechanisms effectively. Specifically, SA first groups the channel dimension into multiple sub-features and processes them in parallel. For each sub-feature, SA utilizes a Shuffle Unit to depict feature dependencies in both the spatial and channel dimensions. All sub-features are then aggregated, and a "channel shuffle" operator enables information communication between different sub-features. The proposed SA module is efficient yet effective: against a ResNet50 backbone, SA adds only 300 parameters (vs. 25.56M) and 2.76e-3 GFLOPs (vs. 4.12 GFLOPs), while improving Top-1 accuracy by more than 1.34%. Extensive experimental results on commonly used benchmarks, including ImageNet-1k for classification and MS COCO for object detection and instance segmentation, demonstrate that the proposed SA significantly outperforms current SOTA methods, achieving higher accuracy with lower model complexity.
Pages: 2235-2239
Number of pages: 5
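
For illustration, below is a minimal PyTorch sketch of the mechanism described in the abstract: channels are grouped into sub-features, each sub-feature is split between a channel-attention branch (global average pooling with a learnable scale/shift and a sigmoid gate) and a spatial-attention branch (GroupNorm with a learnable scale/shift and a sigmoid gate), and a channel-shuffle operator mixes information across sub-features. The class name, default group count, and the exact parameterization of the scale/shift weights are assumptions made for this sketch, not the authors' official implementation; refer to the paper via the DOI above for the definitive design.

import torch
import torch.nn as nn


class ShuffleAttention(nn.Module):
    """Sketch of a Shuffle Attention (SA) block: grouped sub-features, a
    channel-attention half and a spatial-attention half per group, then a
    channel shuffle to exchange information across groups."""

    def __init__(self, channels: int, groups: int = 8):
        super().__init__()
        assert channels % (2 * groups) == 0, "channels must be divisible by 2*groups"
        self.groups = groups
        c = channels // (2 * groups)  # channels handled by each attention branch
        # learnable per-channel scale and shift for the two branches (assumed shapes)
        self.cweight = nn.Parameter(torch.zeros(1, c, 1, 1))
        self.cbias = nn.Parameter(torch.ones(1, c, 1, 1))
        self.sweight = nn.Parameter(torch.zeros(1, c, 1, 1))
        self.sbias = nn.Parameter(torch.ones(1, c, 1, 1))
        self.gn = nn.GroupNorm(c, c)  # spatial statistics for the spatial branch
        self.sigmoid = nn.Sigmoid()

    @staticmethod
    def channel_shuffle(x: torch.Tensor, groups: int) -> torch.Tensor:
        # interleave channels so information flows between sub-features
        n, c, h, w = x.shape
        x = x.view(n, groups, c // groups, h, w).transpose(1, 2).contiguous()
        return x.view(n, c, h, w)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        n, c, h, w = x.shape
        # 1) group the channel dimension into sub-features processed in parallel
        x = x.view(n * self.groups, c // self.groups, h, w)
        x_channel, x_spatial = x.chunk(2, dim=1)

        # 2a) channel attention: squeeze spatial dims, scale/shift, gate
        attn = x_channel.mean(dim=(2, 3), keepdim=True)
        x_channel = x_channel * self.sigmoid(self.cweight * attn + self.cbias)

        # 2b) spatial attention: GroupNorm statistics, scale/shift, gate
        attn = self.gn(x_spatial)
        x_spatial = x_spatial * self.sigmoid(self.sweight * attn + self.sbias)

        # 3) aggregate the two halves and shuffle channels across sub-features
        out = torch.cat([x_channel, x_spatial], dim=1).view(n, c, h, w)
        return self.channel_shuffle(out, groups=2)


if __name__ == "__main__":
    feats = torch.randn(2, 256, 56, 56)           # e.g. the output of a ResNet stage
    sa = ShuffleAttention(channels=256, groups=8)
    print(sa(feats).shape)                        # torch.Size([2, 256, 56, 56])

The final shuffle with two groups interleaves the channel- and spatial-attention halves so that the next layer sees both kinds of context; normalization details and default hyper-parameters in the official code may differ from this sketch.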