Scale-aware token-matching for transformer-based object detector

Cited by: 1
Authors
Jung, Aecheon [1]
Hong, Sungeun [1]
Hyun, Yoonsuk [2]
Affiliations
[1] Sungkyunkwan Univ, Dept Immers Media Engn, Seoul, South Korea
[2] Inha Univ, Dept Math, Incheon, South Korea
Funding
National Research Foundation, Singapore;
Keywords
Vision transformer; Object detection; Small object detection;
DOI
10.1016/j.patrec.2024.08.006
Chinese Library Classification
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Owing to advances in deep learning, object detection has made significant progress in estimating the positions and classes of multiple objects within an image. However, detecting objects of various scales within a single image remains challenging. In this study, we propose scale-aware token matching for predicting the positions and classes of objects in transformer-based object detection. Unlike previous methods, which match detection tokens with the ground truth without considering scale during training, we match them according to object size. We divide one detection token set into multiple sets based on scale and match each token set differently with the ground truth, thereby training the model without additional computational cost. The experimental results demonstrate that scale information can be assigned to tokens. Using a novel loss function, scale-aware tokens can independently learn scale-specific information, which improves detection performance on small objects.
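The idea of splitting one detection token set into scale-specific groups and matching each group only against ground-truth boxes of that scale can be illustrated with a minimal sketch. This is not the authors' implementation: the COCO-style area thresholds, the plain L1 box cost, the equal-sized token groups, and every function name below are assumptions chosen for illustration, and a brute-force search stands in for the Hungarian algorithm that DETR-style detectors typically use.

```python
from itertools import permutations

# Assumed COCO-style area bins in squared pixels; the paper's bins may differ.
SCALE_BINS = [
    ("small", 0.0, 32.0 ** 2),
    ("medium", 32.0 ** 2, 96.0 ** 2),
    ("large", 96.0 ** 2, float("inf")),
]


def box_area(box):
    """Area of an (x0, y0, x1, y1) box."""
    x0, y0, x1, y1 = box
    return max(0.0, x1 - x0) * max(0.0, y1 - y0)


def l1_cost(pred, gt):
    """Hypothetical matching cost: L1 distance between box coordinates."""
    return sum(abs(p - g) for p, g in zip(pred, gt))


def min_cost_assignment(token_ids, gt_ids, preds, gts):
    """Brute-force one-to-one matching (stand-in for the Hungarian algorithm).

    Returns a list of (token_index, gt_index) pairs. If a group has fewer
    tokens than ground-truth boxes, no matching exists and [] is returned.
    """
    best, best_cost = [], float("inf")
    for perm in permutations(token_ids, len(gt_ids)):
        cost = sum(l1_cost(preds[t], gts[g]) for t, g in zip(perm, gt_ids))
        if cost < best_cost:
            best, best_cost = list(zip(perm, gt_ids)), cost
    return best


def scale_aware_match(preds, gts, tokens_per_group):
    """Match each contiguous token group only to ground truth in its scale bin."""
    matches, start = [], 0
    for (_, lo, hi), n in zip(SCALE_BINS, tokens_per_group):
        token_ids = list(range(start, start + n))
        start += n
        gt_ids = [j for j, g in enumerate(gts) if lo <= box_area(g) < hi]
        matches += min_cost_assignment(token_ids, gt_ids, preds, gts)
    return sorted(matches)


# Six tokens: indices 0-1 are "small", 2-3 "medium", 4-5 "large" tokens.
preds = [
    (0, 0, 200, 200), (0, 0, 12, 12),
    (0, 0, 50, 50), (10, 10, 70, 70),
    (5, 5, 190, 190), (50, 50, 60, 60),
]
gts = [(0, 0, 10, 10), (0, 0, 200, 200)]  # one small box, one large box
print(scale_aware_match(preds, gts, [2, 2, 2]))
```

In this toy input, token 0 predicts the large box exactly, but because it belongs to the small-token group it can only be matched to small ground truth. The large box is therefore assigned to token 4, which is the kind of scale-specific specialization the abstract describes.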
Pages: 197-202
Number of pages: 6