A survey: object detection methods from CNN to transformer

Cited by: 52
Authors
Arkin, Ershat [1]
Yadikar, Nurbiya [1]
Xu, Xuebin [1]
Aysa, Alimjan [2]
Ubul, Kurban [1, 2]
Affiliations
[1] Xinjiang Univ, Coll Informat Sci & Engn, Urumqi 830046, Peoples R China
[2] Xinjiang Univ, Key Lab Multilingual Informat Technol, Urumqi 830046, Peoples R China
Funding
US National Science Foundation;
Keywords
Computer vision; Object detection; Real-time system; CNN; Transformer; NETWORKS;
DOI
10.1007/s11042-022-13801-3
Chinese Library Classification number
TP [Automation technology, computer technology];
Subject classification code
0812;
Abstract
Object detection is one of the most important problems in computer vision. Since AlexNet was proposed, methods based on Convolutional Neural Networks (CNNs) have become mainstream in the field, and much research on neural networks and on variations of algorithm structure has appeared. Achieving fast and accurate detection requires moving beyond the existing CNN framework, which is a major challenge. The Transformer's relatively mature theoretical foundation and technical development in Natural Language Processing have brought it to researchers' attention; Transformer-based methods have been shown to work for computer vision tasks and to outperform existing CNN methods on some of them. To help more researchers understand the development of object detection methods, existing methods, different frameworks, challenging problems, and development trends, this paper introduces the classic CNN-based object detection methods and discusses their highlights, advantages, and disadvantages. Drawing on a large body of literature, the paper compares different CNN detection methods with Transformer detection methods. For a vertical comparison under fair conditions, 13 detection methods that have had a broad impact on the field and are the most mainstream and promising are selected. The comparative data give us confidence in the development of the Transformer and in the convergence between different methods. The paper also presents recent innovative approaches to using the Transformer in computer vision tasks. Finally, the challenges, opportunities, and future prospects of the field are summarized.
Pages: 21353-21383
Page count: 31