DKTNet: Dual-Key Transformer Network for small object detection

Cited by: 28
Authors
Xu, Shoukun [1]
Gu, Jianan [1]
Hua, Yining [2]
Liu, Yi [1]
Affiliations
[1] Changzhou Univ, Changzhou 213164, Jiangsu, Peoples R China
[2] Univ Aberdeen, Aberdeen, Scotland
Funding
National Natural Science Foundation of China
Keywords
Small object detection; Transformer; Dual-key
DOI
10.1016/j.neucom.2023.01.055
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Object detection is a fundamental computer vision task that plays a crucial role in a wide range of real-world applications. However, detecting small objects in complex scenes remains challenging because of the low resolution and noisy appearance caused by occlusion, distant viewpoints, etc. To tackle this issue, a novel transformer architecture, the Dual-Key Transformer Network (DKTNet), is proposed in this paper. To improve the feature attention ability, the coherence of the linear layer outputs Q and V is enhanced by a dual-K integrated from K1 and K2, which are computed along Q and V, respectively. Instead of spatial-wise attention, a channel-wise self-attention mechanism is adopted to promote the important feature channels and suppress the confusing ones. Moreover, 2D and 1D convolution computations for Q, K and V are proposed. Compared with the fully-connected computation in conventional transformer architectures, the 2D convolution can better capture local details and global contextual information, and the 1D convolution can reduce network complexity significantly. Experimental evaluation is conducted on both general and small object detection datasets. The superiority of the aforementioned features in the proposed approach is demonstrated by comparison against state-of-the-art approaches. (c) 2023 Elsevier B.V. All rights reserved.
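The abstract's description of channel-wise attention with a dual key can be illustrated with a rough NumPy sketch. This is not the authors' implementation: the abstract does not say how K1 and K2 are computed or integrated, so the plain matrix projections (standing in for the paper's 2D and 1D convolutions), the element-wise sum K = K1 + K2, the scaling factor, and all function and weight names below are assumptions made only for illustration.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def dual_key_channel_attention(x, w_q, w_v, w_k1, w_k2):
    """Hypothetical dual-key channel attention.

    x: (C, N) feature map with C channels, flattened over N = H*W positions.
    w_*: (C, C) projection matrices (assumed stand-ins for convolutions).
    """
    n = x.shape[1]
    q = w_q @ x                      # (C, N)
    v = w_v @ x                      # (C, N)
    k1 = w_k1 @ q                    # key derived from Q (assumed form)
    k2 = w_k2 @ v                    # key derived from V (assumed form)
    k = k1 + k2                      # assumed integration: element-wise sum
    # Channel-wise attention: a (C, C) map relating channels to channels,
    # rather than an (N, N) map relating spatial positions.
    attn = softmax(q @ k.T / np.sqrt(n), axis=-1)
    return attn @ v, attn            # output is (C, N)

rng = np.random.default_rng(0)
C, H, W = 8, 4, 4
x = rng.standard_normal((C, H * W))
weights = [rng.standard_normal((C, C)) / np.sqrt(C) for _ in range(4)]
out, attn = dual_key_channel_attention(x, *weights)
print(out.shape, attn.shape)  # (8, 16) (8, 8)
```

Note that the attention map here is C x C, so its cost grows with the number of channels rather than with the number of spatial positions, which is the general motivation for channel-wise attention on high-resolution feature maps.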
Pages: 29-41
Number of pages: 13