DETR Novel Small Target Detection Algorithm Based on Swin Transformer

Authors
Xu, Fengchang [1 ,2 ]
Alfred, Rayner [1 ]
Pailus, Rayner Henry [1 ]
Lyu, Ge [2 ]
Du, Shifeng [2 ]
Chew, Jackel Vui Lung [3 ]
Li, Guozhang [4 ]
Wang, Xinliang [5 ]
Affiliations
[1] Univ Malaysia Sabah, Fac Comp & Informat, Creat Adv Machine Intelligence Res Ctr, Jalan UMS, Kota Kinabalu 88400, Sabah, Malaysia
[2] Shandong Vocat Coll Light Ind, Dept Informat Engn, Zibo 255300, Shandong, Peoples R China
[3] Univ Malaysia Sabah Labuan Int Campus, Fac Comp & Informat, Labuan 87000, Malaysia
[4] Hainan Vocat Univ Sci & Technol, Coll Informat Engn, Haikou 571126, Hainan, Peoples R China
[5] Binzhou Civil Air Def Engn & Command Support Ctr, Binzhou 256600, Shandong, Peoples R China
Source
IEEE ACCESS | 2024 / Vol. 12
Keywords
Transformers; Object detection; Feature extraction; Accuracy; Adaptation models; Computational modeling; YOLO; Deep learning; Swin transformer; DETR; small target detection
DOI
10.1109/ACCESS.2024.3445950
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
A small target object is an object whose bounding box is very small relative to the image: typically, the ratio of the width of the bounding box to the width and height of the original image is less than 0.1, or the ratio of the bounding-box area to the original image area is less than 0.03, or the absolute size is less than 32 × 32 pixels. Small target detection has important applications in industrial defect detection, medical image processing, intelligent security, autonomous driving, and many other fields. Although target detection has made great progress, that progress is largely limited to large target objects; because small targets are small, have inconspicuous features, and have insufficient data samples, the accuracy and speed of small target detection remain low. To solve this problem, this paper proposes a novel small target object detection model: Swin Transformer's DETR. First, Swin Transformer is used as the backbone to extract the global features and local information of small targets, and a three-layer feature pyramid structure performs feature fusion at the Neck layer to improve computational efficiency and model accuracy. Second, the detector is optimized: it is replaced with a two-stage detector, and the ReLU activation function of the FFN layer is replaced with the newer SwiGLU activation function, to avoid vanishing and exploding gradients and to enhance the nonlinearity of the model. On the Tiny Person dataset, a large input resolution is adopted, set to [1400, 800]. Experiments on the VOC and Tiny Person datasets give small target detection rates of 88.9% and 48.3%, respectively. The results show that the proposed Swin Transformer's DETR model performs well on various datasets and has strong generalization ability, stability, and accuracy across different scenarios and datasets, exceeding those of other algorithm models.
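The three size criteria in the abstract can be made concrete with a short check. The following is a minimal sketch, not the authors' code: the `Box` type and the `is_small_target` function are hypothetical names introduced here, and two readings are assumptions, namely that the 0.1 threshold applies to both the width ratio and the height ratio, and that "less than 32 × 32 pixels" means both sides are under 32 pixels.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """Bounding-box size in pixels (hypothetical helper type)."""
    width: float
    height: float


def is_small_target(box: Box, image_width: float, image_height: float) -> bool:
    """Return True if the box meets any of the three criteria in the abstract."""
    if image_width <= 0 or image_height <= 0:
        raise ValueError("image dimensions must be positive")

    # Criterion 1 (assumed reading): both side ratios are below 0.1.
    relative_side = (
        box.width / image_width < 0.1 and box.height / image_height < 0.1
    )

    # Criterion 2: the box covers less than 3% of the image area.
    relative_area = (
        (box.width * box.height) / (image_width * image_height) < 0.03
    )

    # Criterion 3 (assumed reading): both sides are below 32 pixels.
    absolute_size = box.width < 32 and box.height < 32

    # The abstract joins the criteria with "or", so any one is sufficient.
    return relative_side or relative_area or absolute_size
```

For instance, with the 1400 × 800 input size mentioned in the abstract, a 20 × 20 box satisfies all three criteria under these assumptions, while a 400 × 300 box satisfies none.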
Pages: 115838-115852
Page count: 15