DKTNet: Dual-Key Transformer Network for small object detection

Cited by: 28
Authors
Xu, Shoukun [1]
Gu, Jianan [1]
Hua, Yining [2]
Liu, Yi [1]
Affiliations
[1] Changzhou Univ, Changzhou 213164, Jiangsu, Peoples R China
[2] Univ Aberdeen, Aberdeen, Scotland
Funding
National Natural Science Foundation of China;
Keywords
Small object detection; Transformer; Dual-key;
DOI
10.1016/j.neucom.2023.01.055
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Object detection is a fundamental computer vision task that plays a crucial role in a wide range of real-world applications. However, detecting small objects in complex scenes remains challenging, because occlusion, distant viewpoints, and similar factors leave them with low resolution and noisy appearance representations. To tackle this issue, this paper proposes a novel transformer architecture, the Dual-Key Transformer Network (DKTNet). To improve feature attention, the coherence between the linear-layer outputs Q and V is enhanced by a dual key K integrated from K1 and K2, which are computed along Q and V, respectively. Instead of spatial-wise attention, a channel-wise self-attention mechanism is adopted to promote the important feature channels and suppress the confusing ones. Moreover, 2D and 1D convolution computations for Q, K, and V are proposed. Compared with the fully connected computation in conventional transformer architectures, the 2D convolution can better capture local details and global contextual information, and the 1D convolution can significantly reduce network complexity. Experiments are conducted on both general and small object detection datasets, and comparison against state-of-the-art approaches demonstrates the superiority of these features. © 2023 Elsevier B.V. All rights reserved.
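The channel-wise attention idea in the abstract can be illustrated with a rough NumPy sketch. It is not the authors' implementation: the abstract does not say how K1 and K2 are computed or fused, so the plain linear projections (`wq`, `wk1`, `wk2`, `wv`), the element-wise sum used to form the dual key, and the `sqrt(N)` scaling are all assumptions made for illustration. The paper's 2D and 1D convolutions are replaced here by simple matrix products. The point the sketch does show is that channel-wise attention produces a C x C attention map rather than the N x N map of spatial attention.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def channel_attention(q, k, v):
    """Channel-wise attention on (C, N) arrays, N = H * W positions."""
    n = q.shape[1]
    # (C, N) @ (N, C) -> (C, C): one weight per pair of channels.
    attn = softmax(q @ k.T / np.sqrt(n), axis=-1)  # scaling is an assumption
    return attn @ v, attn

rng = np.random.default_rng(0)
C, H, W = 8, 16, 16
x = rng.standard_normal((C, H * W))  # flattened feature map

# Hypothetical projections standing in for the paper's convolutions.
wq, wk1, wk2, wv = (rng.standard_normal((C, C)) / np.sqrt(C) for _ in range(4))
q = wq @ x
v = wv @ x
k1 = wk1 @ q   # assumption: K1 derived from Q
k2 = wk2 @ v   # assumption: K2 derived from V
k = k1 + k2    # assumption: dual key fused by element-wise sum

out, attn = channel_attention(q, k, v)
print(out.shape, attn.shape)
```

With C = 8 channels and N = 256 positions, the attention map has 64 entries instead of the 65,536 a spatial map would need, which is one plausible reason channel-wise attention is attractive for high-resolution feature maps.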
Pages: 29-41
Page count: 13