Swin-Transformer-Enabled YOLOv5 with Attention Mechanism for Small Object Detection on Satellite Images

Cited by: 118
Authors
Gong, Hang [1 ]
Mu, Tingkui [1 ]
Li, Qiuxia [1 ]
Dai, Haishan [2 ]
Li, Chunlai [3 ]
He, Zhiping [3 ]
Wang, Wenjing [1 ]
Han, Feng [1 ]
Tuniyazi, Abudusalamu [1 ]
Li, Haoyang [1 ]
Lang, Xuechan [1 ]
Li, Zhiyuan [1 ]
Wang, Bin [1 ]
Affiliations
[1] Xi An Jiao Tong Univ, Res Ctr Space Opt & Astron, Sch Phys, MOE Key Lab Nonequilibrium Synth & Modulat Conden, Xian 710049, Peoples R China
[2] Shanghai Acad Spaceflight Technol, Shanghai Inst Satellite Engn, Shanghai 201109, Peoples R China
[3] Chinese Acad Sci, Shanghai Inst Tech Phys, Shanghai 200083, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
satellite images; object detection; self-attention mechanism; Swin transformer; deep learning; CLASSIFICATION;
DOI
10.3390/rs14122861
Chinese Library Classification number
X [Environmental Science, Safety Science];
Discipline classification code
08 ; 0830 ;
Abstract
Object detection in natural images has made tremendous progress over the last decade. However, detectors designed for natural images give hardly satisfactory results when applied directly to satellite images, because the bird's-eye perspective of satellite imagery produces intrinsically different object scales and orientations. Moreover, satellite image backgrounds are complex and object areas are small, so small objects tend to be missed because their features are hard to extract. Overlap and occlusion among densely packed objects also degrade detection performance. Although the self-attention mechanism has been introduced to detect small objects, its computational complexity increases with image resolution. To resolve these problems, we modified the general one-stage detector YOLOv5 to adapt it to satellite images. First, new feature fusion layers and a prediction head are added from the shallow layer for small object detection, for the first time, because the shallow layer maximally preserves feature information. Second, the original convolutional prediction heads are replaced with Swin Transformer Prediction Heads (SPHs), also for the first time. SPH is an advanced self-attention mechanism whose shifted window design reduces the computational complexity to linear. Finally, Normalization-based Attention Modules (NAMs) are integrated into YOLOv5 to improve attention performance in a normalized way. The improved YOLOv5 is termed SPH-YOLOv5. It is evaluated on the NWPU-VHR10 and DOTA datasets, which are widely used for evaluating satellite image object detection. Compared with the baseline YOLOv5, SPH-YOLOv5 improves the mean Average Precision (mAP) by 0.071 on the DOTA dataset.
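The complexity claim in the abstract can be illustrated with a rough cost comparison. The sketch below uses the approximate operation counts given in the original Swin Transformer paper for global self-attention and window-based self-attention; these formulas are not stated in this abstract, and the channel width (96), window size (7), and feature-map sizes are illustrative assumptions rather than the SPH-YOLOv5 configuration.

```python
def global_attention_cost(h, w, c):
    """Approximate cost of global self-attention on an h x w token map
    with c channels: 4*h*w*c^2 + 2*(h*w)^2*c (quadratic in h*w)."""
    n = h * w
    return 4 * n * c**2 + 2 * n**2 * c


def window_attention_cost(h, w, c, m=7):
    """Approximate cost of window-based self-attention with m x m windows:
    4*h*w*c^2 + 2*m^2*h*w*c (linear in h*w for a fixed window size m)."""
    n = h * w
    return 4 * n * c**2 + 2 * m**2 * n * c


if __name__ == "__main__":
    channels = 96  # assumed channel width, for illustration only
    for side in (56, 112, 224):  # assumed feature-map side lengths
        g = global_attention_cost(side, side, channels)
        wnd = window_attention_cost(side, side, channels)
        print(f"{side}x{side}: global={g:.3e}  windowed={wnd:.3e}  ratio={g / wnd:.1f}")
```

Under these assumptions, doubling the side length of the feature map multiplies the windowed cost by four (it tracks the number of tokens), while the global cost grows faster because of its quadratic term. This matters most for large satellite images.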
Pages: 17
Related papers
50 in total
  • [31] SF-YOLOv5: Improved YOLOv5 with swin transformer and fusion-concat method for multi-UAV detection
    Ma, Jun
    Wang, Xiao
    Xu, Cuifeng
    Ling, Jing
    MEASUREMENT & CONTROL, 2023, 56 (7-8): 1436-1445
  • [32] Small object detection in UAV image based on improved YOLOv5
    Zhang, Jian
    Wan, Guoyang
    Jiang, Ming
    Lu, Guifu
    Tao, Xiuwen
    Huang, Zhiyuan
    SYSTEMS SCIENCE & CONTROL ENGINEERING, 2023, 11 (01)
  • [33] Improved YOLOv5 Small Object Detection Algorithm in Moving Scenes
    Zhu, Ruixin
    Yang, Fuxing
    Computer Engineering and Applications, 2023, 59 (10): 196-203
  • [34] A novel small object detection algorithm for UAVs based on YOLOv5
    Li, Jianzhuang
    Zhang, Yuechong
    Liu, Haiying
    Guo, Junmei
    Liu, Lida
    Gu, Jason
    Deng, Lixia
    Li, Shuang
    PHYSICA SCRIPTA, 2024, 99 (03)
  • [35] Detection of underwater treasures using attention mechanism and improved YOLOv5
    Lin S.
    Liu M.
    Tao Z.
    Nongye Gongcheng Xuebao/Transactions of the Chinese Society of Agricultural Engineering, 2021, 37 (18): 307-314
  • [36] Improving YOLOv5 with Attention Mechanism for Detecting Boulders from Planetary Images
    Zhu, Linlin
    Geng, Xun
    Li, Zheng
    Liu, Chun
    REMOTE SENSING, 2021, 13 (18)
  • [37] ASG-YOLOv5: Improved YOLOv5 unmanned aerial vehicle remote sensing aerial images scenario for small object detection based on attention and spatial gating
    Shi, Houwang
    Yang, Wenzhong
    Chen, Danni
    Wang, Min
    PLOS ONE, 2024, 19 (06)
  • [38] Similarity Mask Mixed Attention for YOLOv5 Small Ship Detection of Optical Remote Sensing Images
    Zhang, Xiaowen
    Yuan, Shuai
    Luan, Fangjun
    Lv, Jiaqi
    Liu, Guifu
    2022 WRC SYMPOSIUM ON ADVANCED ROBOTICS AND AUTOMATION, WRC SARA, 2022: 263-268
  • [39] FE-YOLOv5: Feature enhancement network based on YOLOv5 for small object detection
    Wang, Min
    Yang, Wenzhong
    Wang, Liejun
    Chen, Danny
    Wei, Fuyuan
    KeZiErBieKe, HaiLaTi
    Liao, Yuanyuan
    JOURNAL OF VISUAL COMMUNICATION AND IMAGE REPRESENTATION, 2023, 90
  • [40] An Improved Underwater Object Detection Algorithm Based on YOLOv5 for Blurry Images
    Cheng, Liyan
    Zhou, Hui
    Le, Xingni
    Chen, Wanru
    Tao, Hechuan
    Ding, Jiarui
    Wang, Xinru
    Wang, Ruizhi
    Yang, Qunhui
    Chen, Chen
    Kong, Meiwei
    2024 12TH INTERNATIONAL CONFERENCE ON INTELLIGENT COMPUTING AND WIRELESS OPTICAL COMMUNICATIONS, ICWOC, 2024: 42-47