CTOD: Cross-Attentive Task-Alignment for One-Stage Object Detection

Authors
Yao, Ruilin [1]
Rong, Yi [1, 2, 3]
Huang, Qiangqiang [1]
Xiong, Shengwu [1, 2, 3]
Affiliations
[1] Wuhan Univ Technol, Sch Comp Sci & Artificial Intelligence, Wuhan 430070, Peoples R China
[2] Wuhan Univ Technol, Sanya Sci & Educ Innovat Pk, Sanya 572000, Peoples R China
[3] Shanghai Artificial Intelligence Lab, Shanghai 200232, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
One-stage object detection; task-alignment; cross-attention; spatial feature aggregation
DOI
10.1109/TCSVT.2024.3422879
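Chinese Library Classification (CLC) number
TM [Electrical engineering]; TN [Electronic technology, communication technology]
Discipline classification code
0808; 0809
Abstract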
Existing one-stage object detectors are commonly implemented in a multi-task learning based manner, which simultaneously solves two different sub-tasks: object classification and localization. To achieve this, the detection heads with two independent branches are typically utilized to extract specific image features for each task separately. However, due to the lack of interaction between the parallel branches, the difference in learning objectives of classification and localization will lead to spatial misalignment between the predictions of these two tasks. In this work, we propose a novel Cross-attentive Task-aligned Object Detection (CTOD) method to handle this problem by explicitly promoting the prediction consistency for both tasks. Specifically, we first design a Dual Task Interaction (DTI) module, which generates task-interactive embeddings for each branch from task-specific features by using a task cross-attention mechanism. Then based on these embeddings, we propose a Spatial Feature Aggregation (SFA) module that calculates offsets and weights to aggregate information from nearby feature points at each spatial location of the task-specific feature maps. Meanwhile, we also generate adjustment parameters from the task-interactive embeddings to finally align the prediction results of the two tasks obtained from the enhanced task-specific features described above. Extensive experiments are conducted on the MS-COCO dataset. When using ResNeXt-101-64x4d-DCN as the backbone, our CTOD method achieves a detection result of 51.8 AP with single-model and single-scale testing, outperforming the recently proposed one-stage detectors ATSS, VFNet, LD and TOOD by 4.1, 1.9, 1.3 and 0.7 AP, respectively. The analysis of qualitative results also illustrates the effectiveness and superiority of CTOD in solving the task misalignment problem for object detection. Our code is available at https://github.com/Mr-Bigworth/CTOD.
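The task cross-attention idea mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' DTI module: it is plain scaled dot-product cross-attention in pure Python, with no learned projections, no multiple heads, and none of the SFA offsets or weights. The names `cross_attention`, `cls_feat`, and `reg_feat` are hypothetical and chosen only for this illustration. Each branch queries the other branch's features, so every output embedding is a weighted average of the other task's feature vectors.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(queries, keys_values):
    """Scaled dot-product cross-attention without learned projections.

    queries:     N vectors of dimension d from one task branch.
    keys_values: M vectors of dimension d from the other task branch,
                 used as both keys and values (a simplifying assumption).
    Returns N vectors, each a convex combination of the rows of keys_values.
    """
    d = len(queries[0])
    scale = 1.0 / math.sqrt(d)
    out = []
    for q in queries:
        # Similarity of this query to every key in the other branch.
        scores = [scale * sum(qi * ki for qi, ki in zip(q, k))
                  for k in keys_values]
        weights = softmax(scores)
        # Weighted average of the other branch's feature vectors.
        out.append([sum(w * k[j] for w, k in zip(weights, keys_values))
                    for j in range(d)])
    return out


# Toy task-specific features: 3 spatial locations, 2 channels each (made up).
cls_feat = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
reg_feat = [[0.5, 0.5], [2.0, 0.0], [0.0, 2.0]]

# Each branch attends to the other, giving one interactive embedding per location.
cls_embed = cross_attention(cls_feat, reg_feat)
reg_embed = cross_attention(reg_feat, cls_feat)
```

In a real detector the queries, keys, and values would come from learned linear projections of the feature maps; the sketch only shows how information from one branch can flow into the other.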
Pages: 11507-11520
Number of pages: 14
Related papers
50 in total
  • [41] EYOLOX: An Efficient One-Stage Object Detection Network Based on YOLOX
    Tang, Rui
    Sun, Hui
    Liu, Di
    Xu, Hui
    Qi, Miao
    Kong, Jun
    APPLIED SCIENCES-BASEL, 2023, 13 (03):
  • [42] Multi-category solar radio burst detection based on task-aligned one-stage object detection model
    Wang, Mingming
    Yuan, Guowu
    He, Hailan
    Tan, Chengming
    Wu, Hao
    Zhou, Hao
    ASTROPHYSICS AND SPACE SCIENCE, 2025, 370 (03)
  • [43] One-Stage Lightweight Network of Object Detection for Rectangular Panoramic Images
    Lu, Yingying
    Tie, Yun
    Qi, Lin
    ADVANCED INTELLIGENT COMPUTING TECHNOLOGY AND APPLICATIONS, PT VII, ICIC 2024, 2024, 14868 : 390 - 401
  • [44] DualHead for One-stage Object Detection Networks with Receptive Field Enhancement
    Wang, Shaohua
    Dai, Yaping
    Shao, Shuai
    2022 41ST CHINESE CONTROL CONFERENCE (CCC), 2022, : 6666 - 6671
  • [45] Vehicle Detection in Overhead Satellite Images Using a One-Stage Object Detection Model
    Stuparu, Delia-Georgiana
    Ciobanu, Radu-Ioan
    Dobre, Ciprian
    SENSORS, 2020, 20 (22) : 1 - 18
  • [46] One-Stage Object Referring with Gaze Estimation
    Chen, Jianhang
    Zhang, Xu
    Wu, Yue
    Ghosh, Shalini
    Natarajan, Pradeep
    Chang, Shih-Fu
    Allebach, Jan
    2022 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION WORKSHOPS, CVPRW 2022, 2022, : 5017 - 5026
  • [47] ODEE: A One-Stage Object Detection Framework for Overlapping and Nested Event Extraction
    Ning, Jinzhong
    Yang, Zhihao
    Wang, Zhizheng
    Sun, Yuanyuan
    Lin, Hongfei
    PROCEEDINGS OF THE THIRTY-SECOND INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE, IJCAI 2023, 2023, : 5170 - 5178
  • [48] Generative and self-supervised domain adaptation for one-stage object detection
    Fujii, Kazuma
    Kawamoto, Kazuhiko
    ARRAY, 2021, 11
  • [49] Refined One-Stage Oriented Object Detection Method for Remote Sensing Images
    Hou, Liping
    Lu, Ke
    Xue, Jian
    IEEE TRANSACTIONS ON IMAGE PROCESSING, 2022, 31 : 1545 - 1558
  • [50] One-stage object detection networks for inspecting the surface defects of magnetic tiles
    Wei, Jiaqi
    Zhu, Peiyuan
    Qian, Xiang
    Zhu, Shidong
    2019 IEEE INTERNATIONAL CONFERENCE ON IMAGING SYSTEMS & TECHNIQUES (IST 2019), 2019,