Scale-aware token-matching for transformer-based object detector

Cited by: 1
Authors
Jung, Aecheon [1]
Hong, Sungeun [1]
Hyun, Yoonsuk [2]
Affiliations
[1] Sungkyunkwan Univ, Dept Immers Media Engn, Seoul, South Korea
[2] Inha Univ, Dept Math, Incheon, South Korea
Funding
National Research Foundation, Singapore
Keywords
Vision transformer; Object detection; Small object detection
DOI
10.1016/j.patrec.2024.08.006
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Advances in deep learning have brought significant progress in object detection, which estimates the positions and classes of multiple objects within an image. However, detecting objects of widely varying scales within a single image remains challenging. In this study, we propose scale-aware token matching for predicting the positions and classes of objects in transformer-based object detection. Unlike previous methods, which match detection tokens with the ground truth without considering scale during training, we take the size of each ground-truth object into account. We divide one detection token set into multiple sets based on scale and match each token set differently with the ground truth, thereby training the model without additional computational cost. The experimental results demonstrate that scale information can be assigned to tokens. Using a novel loss function, scale-aware tokens can independently learn scale-specific information, which improves detection performance on small objects.
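The matching idea in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the area thresholds (`SCALE_BOUNDS`), the even split of tokens into three groups, the L1 box cost, and the brute-force search standing in for Hungarian matching are all assumptions made for illustration, and every function name here is hypothetical. It also assumes each token group holds at least as many tokens as there are ground-truth boxes in its scale bucket.

```python
from itertools import permutations

# Assumed area thresholds in pixels^2 (small / medium / large); not from the paper.
SCALE_BOUNDS = (32 ** 2, 96 ** 2)


def scale_bucket(box):
    """Return 0 (small), 1 (medium), or 2 (large) for a (cx, cy, w, h) box."""
    area = box[2] * box[3]
    if area < SCALE_BOUNDS[0]:
        return 0
    if area < SCALE_BOUNDS[1]:
        return 1
    return 2


def l1_cost(pred, gt):
    """Sum of absolute coordinate differences between two boxes."""
    return sum(abs(p - g) for p, g in zip(pred, gt))


def match_group(token_ids, preds, gt_ids, gts):
    """One-to-one matching with minimal total L1 cost, found by brute force."""
    if not gt_ids:
        return []
    if len(token_ids) < len(gt_ids):
        raise ValueError("token group is smaller than its ground-truth bucket")
    best, best_cost = None, float("inf")
    for perm in permutations(token_ids, len(gt_ids)):
        cost = sum(l1_cost(preds[t], gts[g]) for t, g in zip(perm, gt_ids))
        if cost < best_cost:
            best, best_cost = perm, cost
    return list(zip(best, gt_ids))


def scale_aware_match(preds, gts, num_groups=3):
    """Split tokens into scale groups; match each group only to its own bucket."""
    group_size = len(preds) // num_groups
    matches = []
    for k in range(num_groups):
        token_ids = list(range(k * group_size, (k + 1) * group_size))
        gt_ids = [j for j, g in enumerate(gts) if scale_bucket(g) == k]
        matches.extend(match_group(token_ids, preds, gt_ids, gts))
    return sorted(matches)


# Six predicted boxes (tokens 0-1: small group, 2-3: medium, 4-5: large).
preds = [
    (10, 10, 20, 20), (200, 200, 16, 16),
    (50, 50, 60, 60), (300, 300, 50, 50),
    (100, 100, 200, 200), (400, 400, 150, 150),
]
# Three ground-truth boxes: one small, one medium, one large.
gts = [(198, 202, 15, 15), (52, 48, 64, 64), (405, 395, 160, 140)]

print(scale_aware_match(preds, gts))  # list of (token index, ground-truth index)
```

Because each group only ever sees ground truth from its own bucket, the total matching work is split across groups rather than added to, which is in the spirit of the abstract's claim of no extra computation. A practical version would likely replace the brute-force search with a proper assignment solver.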
Pages: 197-202
Number of pages: 6