CTOD: Cross-Attentive Task-Alignment for One-Stage Object Detection

Citations: 0
Authors
Yao, Ruilin [1]
Rong, Yi [1, 2, 3]
Huang, Qiangqiang [1]
Xiong, Shengwu [1, 2, 3]
Affiliations
[1] Wuhan Univ Technol, Sch Comp Sci & Artificial Intelligence, Wuhan 430070, Peoples R China
[2] Wuhan Univ Technol, Sanya Sci & Educ Innovat Pk, Sanya 572000, Peoples R China
[3] Shanghai Artificial Intelligence Lab, Shanghai 200232, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
One-stage object detection; task-alignment; cross-attention; spatial feature aggregation
DOI
10.1109/TCSVT.2024.3422879
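Chinese Library Classification (CLC) number
TM [Electrical engineering]; TN [Electronic and communication technology]
Discipline classification code
0808; 0809
Abstract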
Existing one-stage object detectors are commonly implemented in a multi-task learning based manner, which simultaneously solves two different sub-tasks: object classification and localization. To achieve this, the detection heads with two independent branches are typically utilized to extract specific image features for each task separately. However, due to the lack of interaction between the parallel branches, the difference in learning objectives of classification and localization will lead to spatial misalignment between the predictions of these two tasks. In this work, we propose a novel Cross-attentive Task-aligned Object Detection (CTOD) method to handle this problem by explicitly promoting the prediction consistency for both tasks. Specifically, we first design a Dual Task Interaction (DTI) module, which generates task-interactive embeddings for each branch from task-specific features by using a task cross-attention mechanism. Then based on these embeddings, we propose a Spatial Feature Aggregation (SFA) module that calculates offsets and weights to aggregate information from nearby feature points at each spatial location of the task-specific feature maps. Meanwhile, we also generate adjustment parameters from the task-interactive embeddings to finally align the prediction results of the two tasks obtained from the enhanced task-specific features described above. Extensive experiments are conducted on the MS-COCO dataset. When using ResNeXt-101-64x4d-DCN as the backbone, our CTOD method achieves a detection result of 51.8 AP with single-model and single-scale testing, outperforming the recently proposed one-stage detectors ATSS, VFNet, LD and TOOD by 4.1, 1.9, 1.3 and 0.7 AP, respectively. The analysis of qualitative results also illustrates the effectiveness and superiority of CTOD in solving the task misalignment problem for object detection. Our code is available at https://github.com/Mr-Bigworth/CTOD.
Pages: 11507-11520
Page count: 14
Related papers
50 in total
  • [11] CrabNet: Fully Task-Specific Feature Learning for One-Stage Object Detection
    Wang, Hao
    Wang, Qilong
    Zhang, Hongzhi
    Hu, Qinghua
    Zuo, Wangmeng
    IEEE TRANSACTIONS ON IMAGE PROCESSING, 2022, 31 : 2962 - 2974
  • [12] A compression pipeline for one-stage object detection model
    Li, Zhishan
    Sun, Yiran
    Tian, Guanzhong
    Xie, Lei
    Liu, Yong
    Su, Hongye
    He, Yifan
    JOURNAL OF REAL-TIME IMAGE PROCESSING, 2021, 18 (06) : 1949 - 1962
  • [13] One-Stage Object Detection with Graph Convolutional Networks
    Du, Lijun
    Sun, Xin
    Dong, Junyu
    TWELFTH INTERNATIONAL CONFERENCE ON GRAPHICS AND IMAGE PROCESSING (ICGIP 2020), 2021, 11720
  • [14] FCOS: Fully Convolutional One-Stage Object Detection
    Tian, Zhi
    Shen, Chunhua
    Chen, Hao
    He, Tong
    2019 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2019), 2019, : 9626 - 9635
  • [15] AMA-Det: Enhancing Shared Head of One-Stage Object Detection With Adaptation, Merging, and Alignment
    Cheng, Song
    Li, Feng-Yue
    Qiao, Shu-Shan
    Shang, De-Long
    Zhou, Yu-Mei
    IEEE ACCESS, 2023, 11 : 11377 - 11389
  • [16] Evaluation of Fully Convolutional One-Stage Object Detection for Drone Detection
    Nayak, Abhijeet
    Bouazizi, Mondher
    Ahmad, Tasweer
    Goncalves, Artur
    Rigault, Bastien
    Jain, Raghvendra
    Matsuo, Yutaka
    Prendinger, Helmut
    IMAGE ANALYSIS AND PROCESSING, ICIAP 2022 WORKSHOPS, PT II, 2022, 13374 : 434 - 445
  • [17] Object Detection Method Based On Improved One-Stage Detector
    Wei Hongtao
    Yang Xi
    2020 5TH INTERNATIONAL CONFERENCE ON SMART GRID AND ELECTRICAL AUTOMATION (ICSGEA 2020), 2020, : 209 - 212
  • [18] AP-Loss for Accurate One-Stage Object Detection
    Chen, Kean
    Lin, Weiyao
    Li, Jianguo
    See, John
    Wang, Ji
    Zou, Junni
    IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2021, 43 (11) : 3782 - 3798
  • [19] Weighted Feature Pyramid Network for One-Stage Object Detection
    Tu, Xiaobo
    Zhan, Yongzhao
    IMAGE AND GRAPHICS, ICIG 2019, PT I, 2019, 11901 : 606 - 617
  • [20] Correction to: A compression pipeline for one-stage object detection model
    Li, Zhishan
    Sun, Yiran
    Tian, Guanzhong
    Xie, Lei
    Liu, Yong
    Su, Hongye
    He, Yifan
    JOURNAL OF REAL-TIME IMAGE PROCESSING, 2021, 18 (06) : 1963 - 1964