RQFormer: Rotated Query Transformer for end-to-end oriented object detection

Authors
Zhao, Jiaqi [1, 2, 3]
Ding, Zeyu [1, 2]
Zhou, Yong [1, 2]
Zhu, Hancheng [1, 2]
Du, Wen-Liang [1, 2]
Yao, Rui [1, 2]
El Saddik, Abdulmotaleb [4]
Affiliations
[1] China Univ Min & Technol, Sch Comp Sci & Technol, Xuzhou 221116, Peoples R China
[2] Minist Educ, Mine Digitizat Engn Res Ctr, Xuzhou 221116, Peoples R China
[3] Innovat Res Ctr Disaster Intelligent Prevent & Eme, Xuzhou 221116, Peoples R China
[4] Univ Ottawa, Sch Elect Engn & Comp Sci, Ottawa, ON K1N 6N5, Canada
Funding
National Natural Science Foundation of China;
Keywords
Oriented object detection; Transformer; End-to-end detectors; Attention; Query update;
DOI
10.1016/j.eswa.2024.126034
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104; 0812; 0835; 1405;
Abstract
Oriented object detection presents a challenging task due to the presence of object instances with multiple orientations, varying scales, and dense distributions. Recently, end-to-end detectors have made significant strides by employing attention mechanisms and refining a fixed number of queries through consecutive decoder layers. However, existing end-to-end oriented object detectors still face two primary challenges: (1) misalignment between positional queries and keys, leading to inconsistency between classification and localization; and (2) the presence of a large number of similar queries, which complicates one-to-one label assignments and optimization. To address these limitations, we propose an end-to-end oriented detector called the Rotated Query Transformer, which integrates two key technologies: Rotated RoI Attention (RRoI Attention) and Selective Distinct Queries (SDQ). First, RRoI Attention aligns positional queries and keys from oriented regions of interest through cross-attention. Second, SDQ collects queries from intermediate decoder layers and filters out similar ones to generate distinct queries, thereby facilitating the optimization of one-to-one label assignments. Finally, extensive experiments conducted on four remote sensing datasets and one scene text dataset demonstrate the effectiveness of our method. To further validate its generalization capability, we also extend our approach to horizontal object detection. The code is available at https://github.com/wokaikaixinxin/RQFormer.
Pages: 14