SslTransT: Self-supervised pre-training visual object tracking with Transformers

Times Cited: 0
Authors
Cai, Yannan [1 ]
Tan, Ke [1 ]
Wei, Zhenzhong [1 ]
Affiliations
[1] Beihang Univ, Sch Instrumentat Sci & Optoelect Engn, Key Lab Precis Optomechatron Technol, Minist Educ, Beijing 100191, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Self-supervised; Hybrid CNN-transformer; Visual object tracking; 6D pose measurement system; BENCHMARK;
DOI
10.1016/j.optcom.2024.130329
CLC Classification
O43 [Optics];
Discipline Codes
070207; 0803;
Abstract
Transformer-based visual object tracking surpasses conventional CNN-based counterparts in performance but incurs additional computational overhead. Existing Transformer-based trackers rely on large-scale annotated data and long training periods. To address this issue, we introduce a self-supervised pretext task, named target localization, which randomly crops the target and then pastes it onto various background images. This copy-paste-transform data augmentation strategy can synthesize sufficient training data and facilitate routine training. In addition, freezing the CNN backbone during pre-training and randomly adjusting the template and search area factors further lead to faster training convergence. Extensive experiments on both public tracking benchmarks and real aircraft flight test videos demonstrate that our proposed tracker, SslTransT, significantly outperforms the baseline while requiring only half the training time. Furthermore, we apply SslTransT to a 6D pose measurement system based on vision and laser ranging, achieving excellent tracking results while running in real time.
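The copy-paste-transform augmentation and the random adjustment of template/search area factors described in the abstract can be illustrated with a minimal sketch. The code below is an assumption-laden illustration only: the function names (`copy_paste_pair`, `crop_region`) and parameters (`template_factor`, `search_factor`, `jitter`) are hypothetical and do not come from the authors' implementation.

```python
"""Minimal sketch of a copy-paste-transform augmentation for a
"target localization" pretext task. Hypothetical names/parameters."""
import random
import numpy as np


def crop_region(img, box, area_factor):
    """Square crop centred on the box, with side = area_factor * sqrt(w*h)."""
    x, y, w, h = box
    cx, cy = x + w / 2.0, y + h / 2.0
    side = int(area_factor * np.sqrt(w * h))
    H, W = img.shape[:2]
    x0 = int(np.clip(cx - side / 2.0, 0, max(0, W - side)))
    y0 = int(np.clip(cy - side / 2.0, 0, max(0, H - side)))
    return img[y0:y0 + side, x0:x0 + side]


def copy_paste_pair(target_img, target_box, background_img,
                    template_factor=2.0, search_factor=4.0, jitter=0.5):
    """Crop the target from one frame, paste it at a random location on an
    unrelated background, and return a (template, search, box) training
    triplet. Area factors are randomly jittered to mimic the random
    template/search area-factor adjustment mentioned in the abstract."""
    x, y, w, h = target_box
    patch = target_img[y:y + h, x:x + w].copy()

    # Paste the target patch at a random position on the background image.
    H, W = background_img.shape[:2]
    px = random.randint(0, max(0, W - w))
    py = random.randint(0, max(0, H - h))
    composite = background_img.copy()
    composite[py:py + h, px:px + w] = patch
    pasted_box = (px, py, w, h)

    # Randomly perturb the crop-area factors, then crop the template and
    # search regions around the pasted target.
    tf = template_factor * random.uniform(1 - jitter, 1 + jitter)
    sf = search_factor * random.uniform(1 - jitter, 1 + jitter)
    template = crop_region(composite, pasted_box, tf)
    search = crop_region(composite, pasted_box, sf)
    return template, search, pasted_box
```

The abstract's other ingredient, freezing the CNN backbone during pre-training, would in a typical PyTorch pipeline amount to setting `requires_grad = False` on the backbone parameters before optimization; the exact procedure used by the authors is not detailed here.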
Pages: 10
Related Papers
50 records in total
  • [41] Class incremental learning with self-supervised pre-training and prototype learning
    Liu, Wenzhuo
    Wu, Xin-Jian
    Zhu, Fei
    Yu, Ming-Ming
    Wang, Chuang
    Liu, Cheng-Lin
    PATTERN RECOGNITION, 2025, 157
  • [42] Self-supervised Heterogeneous Graph Pre-training Based on Structural Clustering
    Yang, Yaming
    Guan, Ziyu
    Wang, Zhe
    Zhao, Wei
    Xu, Cai
    Lu, Weigang
    Huang, Jianbin
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 35, NEURIPS 2022, 2022,
  • [43] Masked Autoencoder for Self-Supervised Pre-training on Lidar Point Clouds
    Hess, Georg
    Jaxing, Johan
    Svensson, Elias
    Hagerman, David
    Petersson, Christoffer
    Svensson, Lennart
    2023 IEEE/CVF WINTER CONFERENCE ON APPLICATIONS OF COMPUTER VISION WORKSHOPS (WACVW), 2023, : 350 - 359
  • [44] Feature-Suppressed Contrast for Self-Supervised Food Pre-training
    Liu, Xinda
    Zhu, Yaohui
    Liu, Linhu
    Tian, Jiang
    Wang, Lili
    PROCEEDINGS OF THE 31ST ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, MM 2023, 2023, : 4359 - 4367
  • [45] Self-supervised Pre-training with Acoustic Configurations for Replay Spoofing Detection
    Shim, Hye-jin
    Heo, Hee-Soo
    Jung, Jee-weon
    Yu, Ha-Jin
    INTERSPEECH 2020, 2020, : 1091 - 1095
  • [46] PreTraM: Self-supervised Pre-training via Connecting Trajectory and Map
    Xu, Chenfeng
    Li, Tian
    Tang, Chen
    Sun, Lingfeng
    Keutzer, Kurt
    Tomizuka, Masayoshi
    Fathi, Alireza
    Zhan, Wei
    COMPUTER VISION, ECCV 2022, PT XXXIX, 2022, 13699 : 34 - 50
  • [47] MULTI-TASK SELF-SUPERVISED PRE-TRAINING FOR MUSIC CLASSIFICATION
    Wu, Ho-Hsiang
    Kao, Chieh-Chi
    Tang, Qingming
    Sun, Ming
    McFee, Brian
    Bello, Juan Pablo
    Wang, Chao
    2021 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP 2021), 2021, : 556 - 560
  • [48] Self-supervised Pre-training and Semi-supervised Learning for Extractive Dialog Summarization
    Zhuang, Yingying
    Song, Jiecheng
    Sadagopan, Narayanan
    Beniwal, Anurag
    COMPANION OF THE WORLD WIDE WEB CONFERENCE, WWW 2023, 2023, : 1069 - 1076
  • [49] Self-Supervised Underwater Image Generation for Underwater Domain Pre-Training
    Wu, Zhiheng
    Wu, Zhengxing
    Chen, Xingyu
    Lu, Yue
    Yu, Junzhi
    IEEE TRANSACTIONS ON INSTRUMENTATION AND MEASUREMENT, 2024, 73 : 1 - 14
  • [50] COMPARISON OF SELF-SUPERVISED SPEECH PRE-TRAINING METHODS ON FLEMISH DUTCH
    Poncelet, Jakob
    Hamme, Hugo Van
    2021 IEEE AUTOMATIC SPEECH RECOGNITION AND UNDERSTANDING WORKSHOP (ASRU), 2021, : 169 - 176