RQFormer: Rotated Query Transformer for end-to-end oriented object detection

Cited by: 0
Authors
Zhao, Jiaqi [1, 2, 3]
Ding, Zeyu [1, 2]
Zhou, Yong [1, 2]
Zhu, Hancheng [1, 2]
Du, Wen-Liang [1, 2]
Yao, Rui [1, 2]
El Saddik, Abdulmotaleb [4]
Affiliations
[1] China Univ Min & Technol, Sch Comp Sci & Technol, Xuzhou 221116, Peoples R China
[2] Minist Educ, Mine Digitizat Engn Res Ctr, Xuzhou 221116, Peoples R China
[3] Innovat Res Ctr Disaster Intelligent Prevent & Eme, Xuzhou 221116, Peoples R China
[4] Univ Ottawa, Sch Elect Engn & Comp Sci, Ottawa, ON K1N 6N5, Canada
Funding
National Natural Science Foundation of China;
Keywords
Oriented object detection; Transformer; End-to-end detectors; Attention; Query update;
DOI
10.1016/j.eswa.2024.126034
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Oriented object detection presents a challenging task due to the presence of object instances with multiple orientations, varying scales, and dense distributions. Recently, end-to-end detectors have made significant strides by employing attention mechanisms and refining a fixed number of queries through consecutive decoder layers. However, existing end-to-end oriented object detectors still face two primary challenges: (1) misalignment between positional queries and keys, leading to inconsistency between classification and localization; and (2) the presence of a large number of similar queries, which complicates one-to-one label assignments and optimization. To address these limitations, we propose an end-to-end oriented detector called the Rotated Query Transformer, which integrates two key technologies: Rotated RoI Attention (RRoI Attention) and Selective Distinct Queries (SDQ). First, RRoI Attention aligns positional queries and keys from oriented regions of interest through cross-attention. Second, SDQ collects queries from intermediate decoder layers and filters out similar ones to generate distinct queries, thereby facilitating the optimization of one-to-one label assignments. Finally, extensive experiments conducted on four remote sensing datasets and one scene text dataset demonstrate the effectiveness of our method. To further validate its generalization capability, we also extend our approach to horizontal object detection. The code is available at https://github.com/wokaikaixinxin/RQFormer.
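The SDQ idea of pooling queries from several decoder layers and discarding near-duplicates can be illustrated with a small sketch. This is not the authors' implementation: the function name `select_distinct_queries`, the use of cosine similarity between query embeddings as the similarity measure, the score-ordered greedy rule, and the `sim_threshold` value are all assumptions made here for illustration, and the paper may define similarity differently (for example, on predicted boxes).

```python
import numpy as np


def select_distinct_queries(queries, scores, sim_threshold=0.9):
    """Greedy filter over a pool of query embeddings (hypothetical helper).

    Queries are visited in order of descending score. A query is kept only
    if its cosine similarity to every already-kept query is below
    `sim_threshold`. Returns the indices of the kept queries.
    """
    queries = np.asarray(queries, dtype=float)
    scores = np.asarray(scores, dtype=float)

    # Normalise rows so that a dot product equals cosine similarity.
    norms = np.linalg.norm(queries, axis=1, keepdims=True)
    unit = queries / np.clip(norms, 1e-12, None)

    kept = []
    for idx in np.argsort(-scores):
        if all(float(unit[idx] @ unit[k]) < sim_threshold for k in kept):
            kept.append(int(idx))
    return kept


# Pool queries from two (assumed) intermediate decoder layers.
layer_a = np.array([[1.0, 0.0], [0.0, 1.0]])
layer_b = np.array([[0.99, 0.05], [-1.0, 0.0]])
pool = np.concatenate([layer_a, layer_b], axis=0)
pool_scores = np.array([0.9, 0.8, 0.7, 0.6])

distinct = select_distinct_queries(pool, pool_scores)
print(distinct)  # the third query nearly duplicates the first and is dropped
```

In this toy pool the third query points in almost the same direction as the first, so only three of the four pooled queries survive, which leaves fewer competing candidates for one-to-one label assignment.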
Pages: 14