A survey: object detection methods from CNN to transformer

Cited by: 52
Authors
Arkin, Ershat [1 ]
Yadikar, Nurbiya [1 ]
Xu, Xuebin [1 ]
Aysa, Alimjan [2 ]
Ubul, Kurban [1 ,2 ]
Affiliations
[1] Xinjiang Univ, Coll Informat Sci & Engn, Urumqi 830046, Peoples R China
[2] Xinjiang Univ, Key Lab Multilingual Informat Technol, Urumqi 830046, Peoples R China
Funding
U.S. National Science Foundation;
Keywords
Computer vision; Object detection; Real-time system; CNN; Transformer; NETWORKS;
DOI
10.1007/s11042-022-13801-3
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812 ;
Abstract
Object detection is one of the most important problems in computer vision. After AlexNet was proposed, methods based on Convolutional Neural Networks (CNNs) became mainstream in the computer vision field, and much research on neural networks and on variations of algorithm structures has appeared. Achieving fast and accurate detection requires stepping outside the existing CNN framework, which poses great challenges. The Transformer's relatively mature theoretical support and technological development in Natural Language Processing have brought it to researchers' attention, and it has been shown that Transformer-based methods can be used for computer vision tasks and that they exceed existing CNN methods in some tasks. To help more researchers better understand the development of object detection methods, existing methods, different frameworks, challenging problems, and development trends, this paper introduces historically classic CNN-based object detection methods and discusses the highlights, advantages, and disadvantages of these algorithms. Drawing on a large number of papers, it compares different CNN detection methods and Transformer detection methods. For a vertical comparison under fair conditions, 13 detection methods that have had a broad impact on the field and are the most mainstream and promising are selected. The comparative data give us confidence in the development of the Transformer and in the convergence between different methods. The paper also presents recent innovative approaches to using the Transformer in computer vision tasks. Finally, the challenges, opportunities, and future prospects of this field are summarized.
Pages: 21353 - 21383
Number of pages: 31
Related papers
50 in total
  • [31] Interactive Transformer for Small Object Detection
    Wei, Jian
    Wang, Qinzhao
    Zhao, Zixu
    CMC-COMPUTERS MATERIALS & CONTINUA, 2023, 77 (02): : 1699 - 1717
  • [32] Richer Information Transformer for Object Detection
    Yao, Shunyu
    Qi, Ke
    Chen, Wenbin
    Zhou, Yicong
    2022 5TH INTERNATIONAL CONFERENCE ON MACHINE LEARNING AND NATURAL LANGUAGE PROCESSING, MLNLP 2022, 2022, : 110 - 114
  • [33] Transformer for object detection: Review and benchmark
    Li, Yong
    Miao, Naipeng
    Ma, Liangdi
    Shuang, Feng
    Huang, Xingwen
    ENGINEERING APPLICATIONS OF ARTIFICIAL INTELLIGENCE, 2023, 126
  • [34] DESTR: Object Detection with Split Transformer
    He, Liqiang
    Todorovic, Sinisa
    2022 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2022, : 9367 - 9376
  • [35] DESTR: Object Detection with Split Transformer
    He, Liqiang
    Todorovic, Sinisa
    Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2022, 2022-June : 9367 - 9376
  • [36] TC-Radar: Transformer-CNN Hybrid Network for Millimeter-Wave Radar Object Detection
    Jia, Fengde
    Li, Chenyang
    Bi, Siyi
    Qian, Junhui
    Wei, Leizhe
    Sun, Guohao
    REMOTE SENSING, 2024, 16 (16)
  • [37] CTFU-Net: CNN-Transformer Fusion U-shaped Network for Moving Object Detection
    Xia, Tingting
    Yang, Yizhong
    2024 3RD INTERNATIONAL CONFERENCE ON IMAGE PROCESSING AND MEDIA COMPUTING, ICIPMC 2024, 2024, : 44 - 50
  • [38] MPTC-FPN: A Multilayer Progressive FPN With Transformer-CNN Based Encoder for Salient Object Detection
    Yang, Xiaoqi
    Duan, Liangliang
    IEEE ACCESS, 2022, 10 : 98816 - 98827
  • [39] A survey and performance evaluation of deep learning methods for small object detection
    Liu, Yang
    Sun, Peng
    Wergeles, Nickolas
    Shang, Yi
    EXPERT SYSTEMS WITH APPLICATIONS, 2021, 172
  • [40] Survey and systematization of 3D object detection models and methods
    Moritz Drobnitzky
    Jonas Friederich
    Bernhard Egger
    Patrick Zschech
    The Visual Computer, 2024, 40 : 1867 - 1913