TrackingMamba: Visual State Space Model for Object Tracking

Cited by: 2
Authors
Wang, Qingwang [1 ,2 ]
Zhou, Liyao [1 ,2 ]
Jin, Pengcheng [1 ,2 ]
Xin, Qu [1 ,2 ]
Zhong, Hangwei [1 ,2 ]
Song, Haochen [1 ,2 ]
Shen, Tao [1 ,2 ]
Affiliations
[1] Kunming Univ Sci & Technol, Fac Informat Engn & Automat, Kunming 650500, Peoples R China
[2] Kunming Univ Sci & Technol, Yunnan Key Lab Comp Technol Applicat, Kunming 650500, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Object tracking; Autonomous aerial vehicles; Transformers; Feature extraction; Computational modeling; Accuracy; Visualization; Jungle scenes; Mamba; object tracking; UAV remote sensing;
DOI
10.1109/JSTARS.2024.3458938
Chinese Library Classification
TM [Electrical Engineering]; TN [Electronics and Communication Technology];
Discipline Codes
0808; 0809;
Abstract
In recent years, UAV object tracking has provided technical support across various fields. Most existing work relies on convolutional neural networks (CNNs) or vision transformers. However, CNNs have limited receptive fields, resulting in suboptimal performance, while transformers require substantial computational resources, making training and inference challenging. Mountainous and jungle environments, critical components of the Earth's surface and key scenarios for UAV object tracking, present unique challenges due to steep terrain, dense vegetation, and rapidly changing weather conditions, all of which complicate UAV tracking. The lack of relevant datasets further reduces tracking accuracy. This article introduces a new tracking framework based on a state-space model, called TrackingMamba, which uses a single-stream tracking architecture with Vision Mamba as its backbone. TrackingMamba not only matches transformer-based trackers in global feature extraction and long-range dependency modeling but also maintains computational cost that grows linearly with sequence length. Compared to other advanced trackers, TrackingMamba delivers higher accuracy with a simpler model framework, fewer parameters, and reduced FLOPs. Specifically, on the UAV123 benchmark, TrackingMamba outperforms the baseline model OSTrack-256, improving AUC by 2.59% and precision by 4.42%, while reducing parameters by 95.52% and FLOPs by 95.02%. The article also evaluates the performance and shortcomings of TrackingMamba and other advanced trackers in the complex and critical context of jungle environments, and it explores potential future research directions in UAV jungle object tracking.
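The abstract contrasts Mamba's linear-cost sequence modeling with the quadratic cost of transformer attention. As an illustrative sketch only (not the TrackingMamba implementation; the function name, shapes, and parameterization below are hypothetical simplifications), a discretized state-space model computes h_t = A·h_{t-1} + B·x_t and y_t = C·h_t in a single pass over the sequence, so cost grows linearly with length T:

```python
import numpy as np

def ssm_scan(x, A, B, C):
    """Sequential state-space scan (illustrative, simplified).

    x: (T,) scalar input sequence; A: (N, N) state matrix;
    B, C: (N,) input and output projections.
    Each step is a fixed-size state update, so total cost is O(T),
    unlike the O(T^2) pairwise attention matrix of a transformer.
    """
    N = A.shape[0]
    h = np.zeros(N)               # hidden state carried across the sequence
    ys = []
    for x_t in x:
        h = A @ h + B * x_t       # state update: h_t = A h_{t-1} + B x_t
        ys.append(float(C @ h))   # readout: y_t = C h_t
    return np.array(ys)
```

For example, with A = 0.5·I, B = C = 1 (vectors of ones, N = 2), an impulse input [1, 0, 0] yields outputs that decay geometrically, showing how the fixed-size state summarizes history without attending over all past tokens.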
Pages: 16744-16754 (11 pages)