Scale-aware token-matching for transformer-based object detector

Cited by: 1
Authors
Jung, Aecheon [1]
Hong, Sungeun [1]
Hyun, Yoonsuk [2]
Affiliations
[1] Sungkyunkwan Univ, Dept Immers Media Engn, Seoul, South Korea
[2] Inha Univ, Dept Math, Incheon, South Korea
Funding
National Research Foundation, Singapore
Keywords
Vision transformer; Object detection; Small object detection
DOI
10.1016/j.patrec.2024.08.006
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Owing to advances in deep learning, object detection has made significant progress in estimating the positions and classes of multiple objects within an image. However, detecting objects of various scales within a single image remains challenging. In this study, we propose scale-aware token matching for predicting the positions and classes of objects in transformer-based object detection. Unlike previous methods, which match detection tokens to the ground truth without considering scale during training, we take the size of each ground-truth object into account when matching. We divide a single set of detection tokens into multiple sets based on scale and match each token set with the ground truth differently, thereby training the model without additional computational cost. The experimental results demonstrate that scale information can be assigned to tokens. Using a novel loss function, scale-aware tokens can independently learn scale-specific information, which improves detection performance on small objects.
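The matching idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the partitioning rule, the matching cost, or the loss, so the COCO-style area thresholds, the plain L1 box cost, and every function name below are assumptions made for illustration. A brute-force search stands in for the Hungarian matching that DETR-style detectors typically use, which keeps the example dependency-free but only practical for a handful of boxes.

```python
from itertools import permutations

# Assumed area ranges (COCO-style small / medium / large); not from the paper.
SCALE_BOUNDS = [(0.0, 32 ** 2), (32 ** 2, 96 ** 2), (96 ** 2, float("inf"))]


def box_area(box):
    """Area of an (x0, y0, x1, y1) box."""
    x0, y0, x1, y1 = box
    return max(0.0, x1 - x0) * max(0.0, y1 - y0)


def l1_cost(pred, gt):
    """Placeholder matching cost: L1 distance between box coordinates."""
    return sum(abs(p - g) for p, g in zip(pred, gt))


def min_cost_assignment(token_ids, gt_ids, preds, gts):
    """One-to-one assignment of tokens to ground-truth boxes with minimum
    total cost, found by brute force (a stand-in for Hungarian matching)."""
    if len(gt_ids) > len(token_ids):
        raise ValueError("more ground-truth boxes than tokens in this group")
    best, best_cost = [], float("inf")
    for perm in permutations(token_ids, len(gt_ids)):
        cost = sum(l1_cost(preds[t], gts[g]) for t, g in zip(perm, gt_ids))
        if cost < best_cost:
            best, best_cost = list(zip(perm, gt_ids)), cost
    return best


def scale_aware_match(preds, gts, token_groups):
    """Match each token group only against ground truth in its scale range.

    token_groups[i] lists the token indices reserved for SCALE_BOUNDS[i].
    Returns sorted (token_index, gt_index) pairs.
    """
    matches = []
    for (lo, hi), token_ids in zip(SCALE_BOUNDS, token_groups):
        gt_ids = [j for j, b in enumerate(gts) if lo <= box_area(b) < hi]
        matches += min_cost_assignment(token_ids, gt_ids, preds, gts)
    return sorted(matches)


if __name__ == "__main__":
    # Six detection tokens split into three scale groups of two tokens each.
    token_groups = [[0, 1], [2, 3], [4, 5]]
    preds = [
        (0, 0, 5, 5), (11, 9, 21, 19),             # small-object tokens
        (50, 50, 100, 100), (0, 0, 60, 60),        # medium-object tokens
        (5, 5, 190, 195), (100, 100, 300, 300),    # large-object tokens
    ]
    gts = [(10, 10, 20, 20), (0, 0, 200, 200)]     # one small, one large box
    print(scale_aware_match(preds, gts, token_groups))  # [(1, 0), (4, 1)]
```

Because each group is matched separately, a small ground-truth box can only ever be assigned to a token from the small-object group, even if a token from another group happens to predict a closer box. Unmatched tokens would be treated as background in a full training loop, which is omitted here.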
Pages: 197-202
Number of pages: 6