RQFormer: Rotated Query Transformer for end-to-end oriented object detection

Authors
Zhao, Jiaqi [1, 2, 3]
Ding, Zeyu [1, 2]
Zhou, Yong [1, 2]
Zhu, Hancheng [1, 2]
Du, Wen-Liang [1, 2]
Yao, Rui [1, 2]
El Saddik, Abdulmotaleb [4]
Affiliations
[1] China Univ Min & Technol, Sch Comp Sci & Technol, Xuzhou 221116, Peoples R China
[2] Minist Educ, Mine Digitizat Engn Res Ctr, Xuzhou 221116, Peoples R China
[3] Innovat Res Ctr Disaster Intelligent Prevent & Eme, Xuzhou 221116, Peoples R China
[4] Univ Ottawa, Sch Elect Engn & Comp Sci, Ottawa, ON K1N 6N5, Canada
Funding
National Natural Science Foundation of China
Keywords
Oriented object detection; Transformer; End-to-end detectors; Attention; Query update;
DOI
10.1016/j.eswa.2024.126034
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Oriented object detection presents a challenging task due to the presence of object instances with multiple orientations, varying scales, and dense distributions. Recently, end-to-end detectors have made significant strides by employing attention mechanisms and refining a fixed number of queries through consecutive decoder layers. However, existing end-to-end oriented object detectors still face two primary challenges: (1) misalignment between positional queries and keys, leading to inconsistency between classification and localization; and (2) the presence of a large number of similar queries, which complicates one-to-one label assignments and optimization. To address these limitations, we propose an end-to-end oriented detector called the Rotated Query Transformer, which integrates two key technologies: Rotated RoI Attention (RRoI Attention) and Selective Distinct Queries (SDQ). First, RRoI Attention aligns positional queries and keys from oriented regions of interest through cross-attention. Second, SDQ collects queries from intermediate decoder layers and filters out similar ones to generate distinct queries, thereby facilitating the optimization of one-to-one label assignments. Finally, extensive experiments conducted on four remote sensing datasets and one scene text dataset demonstrate the effectiveness of our method. To further validate its generalization capability, we also extend our approach to horizontal object detection. The code is available at https://github.com/wokaikaixinxin/RQFormer.
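The SDQ idea in the abstract (pool queries from intermediate decoder layers, then drop near-duplicates) can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not say how similarity is measured, so the cosine-similarity test, the `threshold` value, the per-query confidence scores, and the function name `select_distinct_queries` are all assumptions made for illustration only.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def select_distinct_queries(layer_queries, layer_scores, threshold=0.95):
    """Hypothetical helper: pool queries from several decoder layers and
    greedily keep the highest-scoring ones that are not too similar to any
    query already kept.

    layer_queries: list (one entry per layer) of lists of embedding vectors.
    layer_scores:  same shape, one confidence score per query (assumed input).
    threshold:     assumed similarity cutoff; not taken from the paper.
    """
    # Pool (score, query) pairs across all layers.
    pooled = [
        (score, query)
        for queries, scores in zip(layer_queries, layer_scores)
        for query, score in zip(queries, scores)
    ]
    # Visit queries from most to least confident.
    pooled.sort(key=lambda pair: pair[0], reverse=True)

    kept = []
    for _, query in pooled:
        # Keep a query only if it is distinct from everything kept so far.
        if all(cosine(query, other) < threshold for other in kept):
            kept.append(query)
    return kept


if __name__ == "__main__":
    queries = [[[1.0, 0.0], [0.0, 1.0]], [[0.99, 0.01], [-1.0, 0.0]]]
    scores = [[0.9, 0.8], [0.7, 0.6]]
    print(select_distinct_queries(queries, scores))
```

In this toy input, the query `[0.99, 0.01]` from the second layer is nearly parallel to the higher-scoring `[1.0, 0.0]` and is filtered out, leaving three distinct queries for one-to-one assignment.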
Pages: 14