Transformer for multiple object tracking: Exploring locality to vision

Cited by: 7
Authors
Wu, Shan [1]
Hadachi, Amnir [1]
Lu, Chaoru [2]
Vivet, Damien [3]
Affiliations
[1] Univ Tartu, Inst Comp Sci, ITS Lab, Narva mnt 18, EE-51009 Tartu, Estonia
[2] Oslo Metropolitan Univ, Ctr Metropolitan Digitalizat & Smartizat MetSmart, Dept Built Environm, Pilestredet 46, N-0167 Oslo, Norway
[3] Univ Toulouse, ISAE SUPAERO, 10 Ave Edouard Belin, F-31400 Toulouse, France
Keywords
Multi-object tracking; Transformer; Deep learning; Locality to vision;
DOI
10.1016/j.patrec.2023.04.016
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Multi-object tracking (MOT) is a critical task in domains such as traffic analysis, surveillance, and autonomous vehicles. The joint-detection-and-tracking paradigm has been extensively researched; it is faster and more convenient to train and deploy than the classic tracking-by-detection paradigm while achieving state-of-the-art performance. This paper explores the possibility of enhancing an MOT system by combining the prevailing convolutional neural network (CNN) with a novel vision transformer technique, locality. Transformers adopted for computer vision tasks have several deficiencies. While they are good at modeling global information over a long embedding, they lack a locality mechanism, which learns local features. This could lead to small objects being overlooked, which may cause security issues. We combine the TransTrack MOT system with a locality mechanism inspired by LocalViT and find that the locality-enhanced system outperforms the baseline TransTrack by 5.3% MOTA on the MOT17 dataset. © 2023 Elsevier B.V. All rights reserved.
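The abstract reports its gain in MOTA without defining the metric. The sketch below assumes the standard CLEAR MOT definition, MOTA = 1 - (FN + FP + IDSW) / GT, with counts summed over all frames. The function name `mota` and every count in the usage lines are hypothetical and chosen for illustration only; none of them come from the paper or from MOT17.

```python
def mota(false_negatives: int, false_positives: int,
         id_switches: int, num_gt: int) -> float:
    """Multiple Object Tracking Accuracy under the CLEAR MOT definition.

    All four arguments are totals accumulated over every frame of a sequence.
    The score can be negative when the error count exceeds num_gt.
    """
    if num_gt <= 0:
        raise ValueError("num_gt must be positive")
    errors = false_negatives + false_positives + id_switches
    return 1.0 - errors / num_gt


# Hypothetical error counts (NOT results from the paper).
baseline = mota(false_negatives=300, false_positives=100,
                id_switches=20, num_gt=1000)
improved = mota(false_negatives=260, false_positives=90,
                id_switches=10, num_gt=1000)

print(f"baseline MOTA: {baseline:.3f}")
print(f"improved MOTA: {improved:.3f}")
print(f"difference in percentage points: {100 * (improved - baseline):.1f}")
```

Because MOTA is itself a percentage-like score, a gain such as the one quoted in the abstract is usually read as a difference in percentage points, as in the last printed line; the abstract does not say which reading is intended.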
Pages: 70-76
Page count: 7