Online Multi-Scale Classification and Global Feature Modulation for Robust Visual Tracking

Cited by: 3

Authors:

Gao, Qi ^{[1]}

Yin, Mingfeng ^{[2]}

Wu, Xiang ^{[3]}

Liu, Di ^{[4]}

Bo, Yuming ^{[3]}

Affiliations:

[1] Jiangsu Univ Technol, Coll Mech Engn, Changzhou 213001, Peoples R China

[2] Jiangsu Univ Technol, Sch Automobile & Traff Engn, Changzhou 213001, Peoples R China

[3] Nanjing Univ Sci & Technol, Sch Automat, Nanjing 210094, Peoples R China

[4] Nanjing Inst Technol, Sch Automat, Nanjing 211167, Peoples R China

Source:

IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY | 2024 / Vol. 34 / No. 07

Funding:

National Natural Science Foundation of China;

Keywords:

Visualization; Target tracking; Accuracy; Fuses; Modulation; Transformers; Real-time systems; Visual object tracking; coordinate attention; online multi-scale classification; global feature modulation; OBJECT TRACKING;

DOI:

10.1109/TCSVT.2023.3343949

Chinese Library Classification (CLC) number:

TM [Electrical Engineering]; TN [Electronic Technology, Communication Technology];

Discipline classification code:

0808 ; 0809 ;

Abstract:

Recent advanced trackers, which combine discriminative classification with dedicated bounding box estimation, have achieved remarkable gains in visual object tracking performance. However, existing methods cannot meet the demands of tracking in complex scenes involving challenges such as occlusion and scale variation. To this end, we propose a novel tracker with online multi-scale classification and global feature modulation for robust visual tracking. It builds on accurate tracking by overlap maximization (ATOM) and is named ATOM+. First, coordinate attention (CA) is applied to enhance the target features in the channel and spatial dimensions, which effectively improves the feature representation ability of the backbone network. Second, an online multi-scale classification (OMC) module is designed. During online tracking, it generates more reliable matching responses by aggregating target-related information from different scales. This operation enables the tracker to perceive the target stably, particularly when the target's appearance and pose change severely. Third, a global feature modulation (GFM) mechanism, which requires only a small amount of computation, is constructed to fuse the spatial contextual information of the template image into the search region. This integration refines the bounding box to yield an accurate estimate of the target state. Finally, comprehensive experiments on the conventional tracking benchmarks OTB100, LaSOT, and VOT2018 show that our tracker handles a range of challenging scenarios and achieves state-of-the-art performance. The tracker runs in real time at an average speed of 37 FPS.
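
As an illustration of the coordinate attention (CA) step named in the abstract, the following is a minimal pure-Python sketch of the commonly described CA design: average-pool the feature map along each spatial direction, pass the pooled descriptors through a shared channel reduction, and use the results as per-row and per-column gates. The abstract does not give the configuration used in ATOM+, so everything here is an assumption: the function and parameter names (`coordinate_attention`, `w_reduce`, `w_h`, `w_w`) are hypothetical, the weights are random rather than learned, and normalization is omitted with ReLU standing in for the non-linearity.

```python
import math
import random


def matmul(a, b):
    """Multiply an (n x k) matrix by a (k x m) matrix, both given as nested lists."""
    return [[sum(a[i][t] * b[t][j] for t in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def coordinate_attention(x, w_reduce, w_h, w_w):
    """Gate a (C, H, W) feature map with direction-aware attention.

    w_reduce: (M, C) shared 1x1 projection (hypothetical name).
    w_h, w_w: (C, M) 1x1 projections for the height and width gates.
    """
    channels, height, width = len(x), len(x[0]), len(x[0][0])
    # Directional pooling: average each row (over width) and each column (over height).
    pooled_h = [[sum(row) / width for row in ch] for ch in x]              # (C, H)
    pooled_w = [[sum(ch[i][j] for i in range(height)) / height
                 for j in range(width)] for ch in x]                       # (C, W)
    # Concatenate along the spatial axis, then apply the shared reduction and ReLU.
    joint = [ph + pw for ph, pw in zip(pooled_h, pooled_w)]                # (C, H+W)
    reduced = [[max(0.0, v) for v in row] for row in matmul(w_reduce, joint)]
    # Split back into the two directions and map each to per-channel gates in (0, 1).
    feat_h = [row[:height] for row in reduced]                             # (M, H)
    feat_w = [row[height:] for row in reduced]                             # (M, W)
    gate_h = [[sigmoid(v) for v in row] for row in matmul(w_h, feat_h)]    # (C, H)
    gate_w = [[sigmoid(v) for v in row] for row in matmul(w_w, feat_w)]    # (C, W)
    # Each position is scaled by the gate of its row and the gate of its column.
    return [[[x[c][i][j] * gate_h[c][i] * gate_w[c][j] for j in range(width)]
             for i in range(height)]
            for c in range(channels)]


# Toy example with random (not learned) weights; sizes are arbitrary assumptions.
rng = random.Random(0)
C, H, W, M = 4, 3, 5, 2
x = [[[rng.uniform(-1, 1) for _ in range(W)] for _ in range(H)] for _ in range(C)]
w_reduce = [[rng.uniform(-1, 1) for _ in range(C)] for _ in range(M)]
w_h = [[rng.uniform(-1, 1) for _ in range(M)] for _ in range(C)]
w_w = [[rng.uniform(-1, 1) for _ in range(M)] for _ in range(C)]
y = coordinate_attention(x, w_reduce, w_h, w_w)
```

Because both gates lie strictly between 0 and 1, the output keeps the input shape and never exceeds the input in magnitude. In a trained network the three projections would be learned 1x1 convolutions inside the backbone.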

Pages: 5321 - 5334

Page count: 14

50 records in total

[1] Exploiting multi-scale hierarchical feature representation for visual tracking
Wang, Jun
Yin, Peng
Yang, Wenhui
Wang, Yuanyun
Wang, Shengqian
COMPLEX & INTELLIGENT SYSTEMS, 2024, 10 (03) : 3617 - 3632
[2] Exploiting multi-scale hierarchical feature representation for visual tracking
Jun Wang
Peng Yin
Wenhui Yang
Yuanyun Wang
Shengqian Wang
Complex & Intelligent Systems, 2024, 10 : 3617 - 3632
[3] Robust visual tracking via identifying multi-scale patches
Liang, Yun
Li, Ke
Zhang, Jian
Wang, Meihua
Lin, Chen
MULTIMEDIA TOOLS AND APPLICATIONS, 2019, 78 (11) : 14195 - 14230
[4] Robust visual tracking via identifying multi-scale patches
Yun Liang
Ke Li
Jian Zhang
Meihua Wang
Chen Lin
Multimedia Tools and Applications, 2019, 78 : 14195 - 14230
[5] GLOBAL AND MULTI-SCALE FEATURE LEARNING FOR REMOTE SENSING SCENE CLASSIFICATION
Xia, Ziying
Gan, Guolong
Liu, Siyu
Cao, Wei
Cheng, Jian
2022 IEEE INTERNATIONAL GEOSCIENCE AND REMOTE SENSING SYMPOSIUM (IGARSS 2022), 2022, : 655 - 658
[6] MSTrack: Visual Tracking with Multi-scale Attention
Song, Chunlin
Yao, Yu
Guo, Jianhui
Li, Lunbo
PROCEEDINGS OF 2024 INTERNATIONAL CONFERENCE ON COMPUTER AND MULTIMEDIA TECHNOLOGY, ICCMT 2024, 2024, : 337 - 344
[7] Robust Visual Tracking via Multi-Scale Spatio-Temporal Context Learning
Xue, Wanli
Xu, Chao
Feng, Zhiyong
IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2018, 28 (10) : 2849 - 2860
[8] Robust visual tracking via online informative feature selection
Song, Huihui
ELECTRONICS LETTERS, 2014, 50 (25) : 1931 - 1932
[9] Qualitative multi-scale feature hierarchies for object tracking
Bretzner, L
Lindeberg, T
SCALE-SPACE THEORIES IN COMPUTER VISION, 1999, 1682 : 117 - 128
[10] Multi-scale binary robust independent feature descriptor
Yang, Changqing
Wang, Xiaotong
INFORMATION TECHNOLOGY, 2015, : 251 - 255