RQFormer: Rotated Query Transformer for end-to-end oriented object detection

Citations: 0

Authors:

Zhao, Jiaqi [1,2,3]

Ding, Zeyu [1,2]

Zhou, Yong [1,2]

Zhu, Hancheng [1,2]

Du, Wen-Liang [1,2]

Yao, Rui [1,2]

El Saddik, Abdulmotaleb [4]

Affiliations:

[1] China Univ Min & Technol, Sch Comp Sci & Technol, Xuzhou 221116, Peoples R China

[2] Minist Educ, Mine Digitizat Engn Res Ctr, Xuzhou 221116, Peoples R China

[3] Innovat Res Ctr Disaster Intelligent Prevent & Eme, Xuzhou 221116, Peoples R China

[4] Univ Ottawa, Sch Elect Engn & Comp Sci, Ottawa, ON K1N 6N5, Canada

Source:

Expert Systems with Applications | 2025 / Vol. 266

Funding:

National Natural Science Foundation of China

Keywords:

Oriented object detection; Transformer; End-to-end detectors; Attention; Query update

DOI:

10.1016/j.eswa.2024.126034

Chinese Library Classification number:

TP18 [Artificial intelligence theory]

Discipline classification codes:

081104; 0812; 0835; 1405

Abstract:

Oriented object detection presents a challenging task due to the presence of object instances with multiple orientations, varying scales, and dense distributions. Recently, end-to-end detectors have made significant strides by employing attention mechanisms and refining a fixed number of queries through consecutive decoder layers. However, existing end-to-end oriented object detectors still face two primary challenges: (1) misalignment between positional queries and keys, leading to inconsistency between classification and localization; and (2) the presence of a large number of similar queries, which complicates one-to-one label assignments and optimization. To address these limitations, we propose an end-to-end oriented detector called the Rotated Query Transformer, which integrates two key technologies: Rotated RoI Attention (RRoI Attention) and Selective Distinct Queries (SDQ). First, RRoI Attention aligns positional queries and keys from oriented regions of interest through cross-attention. Second, SDQ collects queries from intermediate decoder layers and filters out similar ones to generate distinct queries, thereby facilitating the optimization of one-to-one label assignments. Finally, extensive experiments conducted on four remote sensing datasets and one scene text dataset demonstrate the effectiveness of our method. To further validate its generalization capability, we also extend our approach to horizontal object detection. The code is available at https://github.com/wokaikaixinxin/RQFormer.


Pages: 14
