DKTNet: Dual-Key Transformer Network for small object detection

Cited by: 28

Authors:

Xu, Shoukun [1]

Gu, Jianan [1]

Hua, Yining [2]

Liu, Yi [1]

Affiliations:

[1] Changzhou Univ, Changzhou 213164, Jiangsu, Peoples R China

[2] Univ Aberdeen, Aberdeen, Scotland

Source:

NEUROCOMPUTING | 2023 / Vol. 525

Funding:

National Natural Science Foundation of China;

Keywords:

Small object detection; Transformer; Dual-key;

DOI:

10.1016/j.neucom.2023.01.055

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104; 0812; 0835; 1405;

Abstract:

Object detection is a fundamental computer vision task that plays a crucial role in a wide range of real-world applications. However, detecting small objects in complex scenes remains challenging because of their low resolution and noisy appearance caused by occlusion, distant views, etc. To tackle this issue, a novel transformer architecture, the Dual-Key Transformer Network (DKTNet), is proposed in this paper. To improve feature attention, the coherence of the linear-layer outputs Q and V is enhanced by a dual-K integrated from K1 and K2, which are computed along Q and V, respectively. Instead of spatial-wise attention, a channel-wise self-attention mechanism is adopted to promote the important feature channels and suppress the confusing ones. Moreover, 2D and 1D convolution computations for Q, K and V are proposed. Compared with the fully-connected computation in conventional transformer architectures, the 2D convolution can better capture local details and global contextual information, and the 1D convolution can reduce network complexity significantly. Experimental evaluation is conducted on both general and small object detection datasets. The superiority of the aforementioned features in the proposed approach is demonstrated through comparison against state-of-the-art approaches. (c) 2023 Elsevier B.V. All rights reserved.
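
The channel-wise attention idea in the abstract can be illustrated with a minimal NumPy sketch. It is not the paper's implementation: the abstract does not say how K1 and K2 are computed or integrated, so the sketch assumes K1 is a linear map of Q, K2 is a linear map of V, and the two are summed. Plain channel-mixing matrices stand in for the 2D and 1D convolutions, and the names `channel_attention`, `w_q`, `w_k1`, `w_k2` and `w_v` are hypothetical.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def channel_attention(feat, w_q, w_k1, w_k2, w_v):
    """Channel-wise self-attention with an assumed dual key.

    feat : (C, H, W) feature map.
    w_*  : (C, C) hypothetical channel-mixing weights; they stand in
           for the convolutions mentioned in the abstract.
    """
    c, h, w = feat.shape
    x = feat.reshape(c, h * w)       # flatten spatial dims -> (C, N)

    q = w_q @ x                      # queries, (C, N)
    v = w_v @ x                      # values,  (C, N)

    # ASSUMPTION: K1 is derived from Q, K2 from V, and the dual key is
    # their sum. The abstract only says K is "integrated from K1 and K2".
    k = w_k1 @ q + w_k2 @ v          # dual key, (C, N)

    # Channel-wise attention: scores relate channels to channels, so the
    # map is C x C rather than the N x N map of spatial attention.
    scores = (q @ k.T) / np.sqrt(h * w)
    attn = softmax(scores, axis=-1)  # each row sums to 1

    out = attn @ v                   # reweighted channels, (C, N)
    return out.reshape(c, h, w), attn


rng = np.random.default_rng(0)
feat = rng.standard_normal((8, 4, 4))
weights = [0.1 * rng.standard_normal((8, 8)) for _ in range(4)]
out, attn = channel_attention(feat, *weights)
print(out.shape, attn.shape)
```

Because the attention map has one row and one column per channel, its size depends on the channel count rather than on the image resolution, which is one reason a channel-wise formulation can be cheaper than spatial attention on high-resolution feature maps.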


Pages: 29-41

Number of pages: 13

