Transformer Tracking for Satellite Video: Matching, Propagation, and Prediction

Authors
Zhao, Manqi [1, 2]
Li, Shengyang [1, 3]
Yang, Jian [1, 3]
Affiliations
[1] Chinese Acad Sci, Technol & Engn Ctr Space Utilizat, Key Lab Space Utilizat, Beijing 100094, Peoples R China
[2] Univ Chinese Acad Sci, Sch Comp Sci & Technol, Beijing 100049, Peoples R China
[3] Univ Chinese Acad Sci, Sch Aeronaut & Astronaut, Beijing 100049, Peoples R China
Keywords
Target tracking; Satellites; Transformers; Training; Object tracking; Predictive models; Pipelines; Adaptation models; Feature extraction; Accuracy; Satellite video object tracking; sequence prediction; static matching; temporal propagation; transformer; OBJECT TRACKING; CORRELATION FILTER;
DOI
10.1109/TGRS.2024.3501380
Chinese Library Classification (CLC) number
P3 [Geophysics]; P59 [Geochemistry]
Discipline classification codes
0708; 070902
Abstract
Recently, transformer-based trackers have brought overwhelming advantages in general video. However, their performance in satellite video has been hindered by insufficient satellite-specific training and a lack of designs tailored to satellite targets and scene characteristics. To tackle these challenges, we propose a novel transformer-based tracking framework for satellite video object tracking: Transformer Matching, Propagation, and Prediction (TransMPP). TransMPP combines three stages: static matching, dynamic propagation, and prediction, to ensure accurate tracking in satellite videos. Specifically, the Matching model uses a one-stream pipeline for simultaneous feature extraction and relationship modeling across extensive search and template areas, thereby improving foreground and background discrimination capabilities. In addition, the Propagation and Prediction models enhance temporal modeling capabilities through local long-term and short-term feature propagation and global sequence prediction, respectively, boosting tracking robustness. Moreover, to ensure a fair comparison and evaluation, we also developed SatSOT-train, a large-scale training dataset for the SatSOT benchmark. After comprehensive training, TransMPP demonstrates state-of-the-art (SOTA) performance on the SatSOT dataset, achieving an area under the curve (AUC) score of 59.9% and a precision score of 71.5%, bringing improvements of 6.3% and 5.3%, respectively. The code will be available at https://github.com/DonDominic/TransMPP.
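The AUC and precision figures quoted in the abstract are the usual single-object-tracking metrics. As a minimal sketch of how such scores are commonly computed (not the authors' evaluation code), the example below takes per-frame boxes as `(x, y, w, h)` tuples, averages the success rate over evenly spaced IoU thresholds, and counts frames whose center error falls within a pixel threshold. The function names, the 21 thresholds, and the 5-pixel default are assumptions for illustration; the record does not state the exact SatSOT protocol.

```python
import math


def iou(box_a, box_b):
    """Intersection over union of two (x, y, w, h) boxes."""
    ax, ay, aw, ah = box_a
    bx, by, bw, bh = box_b
    inter_w = max(0.0, min(ax + aw, bx + bw) - max(ax, bx))
    inter_h = max(0.0, min(ay + ah, by + bh) - max(ay, by))
    inter = inter_w * inter_h
    union = aw * ah + bw * bh - inter
    return inter / union if union > 0 else 0.0


def success_auc(preds, gts, num_thresholds=21):
    """Mean success rate over IoU thresholds evenly spaced in [0, 1].

    The number of thresholds is an assumed convention, not taken from the paper.
    """
    overlaps = [iou(p, g) for p, g in zip(preds, gts)]
    thresholds = [i / (num_thresholds - 1) for i in range(num_thresholds)]
    rates = [sum(o > t for o in overlaps) / len(overlaps) for t in thresholds]
    return sum(rates) / len(rates)


def precision_score(preds, gts, pixel_threshold=5.0):
    """Fraction of frames whose box-center error is within pixel_threshold.

    The 5-pixel default is an assumption chosen for small satellite targets.
    """
    hits = 0
    for (px, py, pw, ph), (gx, gy, gw, gh) in zip(preds, gts):
        dx = (px + pw / 2) - (gx + gw / 2)
        dy = (py + ph / 2) - (gy + gh / 2)
        if math.hypot(dx, dy) <= pixel_threshold:
            hits += 1
    return hits / len(preds)


if __name__ == "__main__":
    ground_truth = [(10, 10, 4, 4), (20, 20, 4, 4)]
    predictions = [(10, 10, 4, 4), (100, 100, 4, 4)]
    print("AUC:", success_auc(predictions, ground_truth))
    print("Precision:", precision_score(predictions, ground_truth))
```

In practice both scores are computed per sequence and then averaged over the benchmark; that aggregation step is omitted here.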
Pages: 16
Related papers
50 items in total
  • [41] IoUformer: Pseudo-IoU prediction with transformer for visual tracking
    Cai, Huayue
    Lan, Long
    Zhang, Jing
    Zhang, Xiang
    Zhan, Yibing
    Luo, Zhigang
    NEURAL NETWORKS, 2024, 170 : 548 - 563
  • [42] Intra-Prediction Mode Propagation for Video Coding
    Zhang, Kai
    Zhang, Li
    Chien, Wei-Jung
    Karczewicz, Marta
    IEEE JOURNAL ON EMERGING AND SELECTED TOPICS IN CIRCUITS AND SYSTEMS, 2019, 9 (01) : 110 - 121
  • [43] PREDICTION-DECISION NETWORK FOR VIDEO OBJECT TRACKING
    Sun, Yasheng
    He, Tao
    Peng, Yinghong
    Qi, Jin
    Hu, Jie
    2020 IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING (ICIP), 2020, : 271 - 275
  • [44] Tracking and matching connected components from 3D video
    Pires, DD
    Cesar, RM
    Vieira, MB
    Velho, L
    SIBGRAPI 2005: XVIII BRAZILIAN SYMPOSIUM ON COMPUTER GRAPHICS AND IMAGE PROCESSING, CONFERENCE PROCEEDINGS, 2005, : 257 - 264
  • [45] Automatic segmentation of video object plane based on object tracking and matching
    Shi, L
    Zhang, ZY
    An, P
    PROCEEDINGS OF 2001 INTERNATIONAL SYMPOSIUM ON INTELLIGENT MULTIMEDIA, VIDEO AND SPEECH PROCESSING, 2001, : 510 - 513
  • [46] Automatic segmentation of video object plane based on object tracking and matching
    Shi, L
    Zhang, ZY
    Wang, H
    IMAGE EXTRACTION, SEGMENTATION, AND RECOGNITION, 2001, 4550 : 28 - 33
  • [47] Cell tracking in microscopic video using matching and linking of bipartite graphs
    Chatterjee, Rohit
    Ghosh, Mayukh
    Chowdhury, Ananda S.
    Ray, Nilanjan
    COMPUTER METHODS AND PROGRAMS IN BIOMEDICINE, 2013, 112 (03) : 422 - 431
  • [48] Object tracking in video pictures based on image segmentation and pattern matching
    Morimoto, T
    Kiriyama, O
    Harada, Y
    Adachi, H
    Koide, T
    Mattausch, HJ
    2005 IEEE INTERNATIONAL SYMPOSIUM ON CIRCUITS AND SYSTEMS (ISCAS), VOLS 1-6, CONFERENCE PROCEEDINGS, 2005, : 3215 - 3218
  • [49] Dynamic object tracking by partial shape matching for video surveillance applications
    Husain, Mustafa
    Saber, Eli
    Misic, Vladimir
    Joralemon, Stephen P.
    2006 IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING, ICIP 2006, PROCEEDINGS, 2006, : 2405 - +
  • [50] Tracking Object in Video Pictures based on Background Subtraction and Image Matching
    Dharamadhat, Thammapong
    Thanasoontornlerk, Kittipong
    Kanongchaiyos, Pizzanu
    2008 IEEE INTERNATIONAL CONFERENCE ON ROBOTICS AND BIOMIMETICS, VOLS 1-4, 2009, : 1255 - 1260