SslTransT: Self-supervised pre-training visual object tracking with Transformers

Cited by: 0
Authors
Cai, Yannan [1 ]
Tan, Ke [1 ]
Wei, Zhenzhong [1 ]
Affiliations
[1] Beihang Univ, Sch Instrumentat Sci & Optoelect Engn, Key Lab Precis Optomechatron Technol, Minist Educ, Beijing 100191, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Self-supervised; Hybrid CNN-transformer; Visual object tracking; 6D pose measurement system; BENCHMARK;
DOI
10.1016/j.optcom.2024.130329
CLC (Chinese Library Classification)
O43 [Optics];
Subject classification code
070207 ; 0803 ;
Abstract
Transformer-based visual object tracking surpasses conventional CNN-based counterparts in performance but comes with additional computational overhead. Existing Transformer-based trackers rely on large-scale annotated data and long training periods. To address this issue, we introduce a self-supervised pretext task, named target localization, which randomly crops the target and then pastes it onto various background images. This copy-paste-transform data augmentation strategy can synthesize sufficient training data and facilitate routine training. In addition, freezing the CNN backbone during pre-training and randomly adjusting the template and search-area factors further accelerate training convergence. Extensive experiments on both public tracking benchmarks and real aircraft flight test videos demonstrate that our proposed tracker, SslTransT, significantly outperforms the baseline while requiring only half the training time. Furthermore, we apply SslTransT to a 6D pose measurement system based on vision and laser ranging, achieving excellent tracking results while running in real time.
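The abstract describes the pretext task only at a high level, so the following is a minimal Python sketch of what a copy-paste-transform augmentation could look like: a target patch is cropped from one frame, randomly rescaled, and pasted at a random position on an unrelated background image, and the resulting box serves as the pseudo ground truth for localization. The function name, parameters, and jitter ranges are illustrative assumptions, not the authors' exact implementation.

```python
# Hypothetical sketch of a copy-paste-transform augmentation for a
# "target localization" pretext task. Assumes numpy and OpenCV only.
import random
import numpy as np
import cv2


def copy_paste_transform(target_img, target_box, background_img,
                         scale_range=(0.5, 1.5)):
    """Crop the target (box = [x, y, w, h] in pixels) from target_img,
    randomly rescale it, and paste it at a random location on
    background_img. Returns the composite image and the new box,
    which acts as the self-supervised localization label."""
    x, y, w, h = target_box
    patch = target_img[y:y + h, x:x + w]

    # Random scale jitter (assumed range).
    s = random.uniform(*scale_range)
    new_w, new_h = max(1, int(w * s)), max(1, int(h * s))
    patch = cv2.resize(patch, (new_w, new_h))

    bg = background_img.copy()
    bg_h, bg_w = bg.shape[:2]
    if new_w >= bg_w or new_h >= bg_h:
        # Fallback: shrink the patch so it always fits the background.
        new_w, new_h = bg_w // 2, bg_h // 2
        patch = cv2.resize(patch, (new_w, new_h))

    # Random paste location fully inside the background.
    px = random.randint(0, bg_w - new_w)
    py = random.randint(0, bg_h - new_h)
    bg[py:py + new_h, px:px + new_w] = patch

    return bg, (px, py, new_w, new_h)


if __name__ == "__main__":
    # Toy usage with random arrays standing in for real frames.
    frame = np.random.randint(0, 255, (360, 480, 3), dtype=np.uint8)
    background = np.random.randint(0, 255, (360, 480, 3), dtype=np.uint8)
    composite, pseudo_box = copy_paste_transform(frame, (100, 80, 64, 48), background)
    print("pseudo label box:", pseudo_box)
```

In a tracking pipeline, the template and search-area crops would then be sampled around this pseudo box (with the randomized crop factors mentioned in the abstract), so the tracker learns to localize the pasted target without any manual annotation.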
Pages: 10
Related papers
50 records in total
  • [31] Multimodal Visual-Tactile Representation Learning through Self-Supervised Contrastive Pre-Training
    Dave, Vedant
    Lygerakis, Fotios
    Rueckert, Elmar
    2024 IEEE INTERNATIONAL CONFERENCE ON ROBOTICS AND AUTOMATION, ICRA 2024, 2024, : 8013 - 8020
  • [32] Self-Supervised Pre-Training Joint Framework: Assisting Lightweight Detection Network for Underwater Object Detection
    Wang, Zhuo
    Chen, Haojie
    Qin, Hongde
    Chen, Qin
    JOURNAL OF MARINE SCIENCE AND ENGINEERING, 2023, 11 (03)
  • [33] Joint Encoder-Decoder Self-Supervised Pre-training for ASR
    Arunkumar, A.
    Umesh, S.
    INTERSPEECH 2022, 2022, : 3418 - 3422
  • [34] ENHANCING THE DOMAIN ROBUSTNESS OF SELF-SUPERVISED PRE-TRAINING WITH SYNTHETIC IMAGES
    Hassan, Mohamad N. C.
    Bhattacharya, Avigyan
    da Costa, Victor G. Turrisi
    Banerjee, Biplab
    Ricci, Elisa
    2024 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING, ICASSP 2024, 2024, : 5470 - 5474
  • [35] Individualized Stress Mobile Sensing Using Self-Supervised Pre-Training
    Islam, Tanvir
    Washington, Peter
    APPLIED SCIENCES-BASEL, 2023, 13 (21):
  • [36] Stabilizing Label Assignment for Speech Separation by Self-supervised Pre-training
    Huang, Sung-Feng
    Chuang, Shun-Po
    Liu, Da-Rong
    Chen, Yi-Chen
    Yang, Gene-Ping
    Lee, Hung-yi
    INTERSPEECH 2021, 2021, : 3056 - 3060
  • [37] Self-Supervised Pre-training for Protein Embeddings Using Tertiary Structures
    Guo, Yuzhi
    Wu, Jiaxiang
    Ma, Hehuan
    Huang, Junzhou
    THIRTY-SIXTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE / THIRTY-FOURTH CONFERENCE ON INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE / THE TWELFTH SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2022, : 6801 - 6809
  • [38] DialogueBERT: A Self-Supervised Learning based Dialogue Pre-training Encoder
    Zhang, Zhenyu
    Guo, Tao
    Chen, Meng
    PROCEEDINGS OF THE 30TH ACM INTERNATIONAL CONFERENCE ON INFORMATION & KNOWLEDGE MANAGEMENT, CIKM 2021, 2021, : 3647 - 3651
  • [39] Progressive self-supervised learning: A pre-training method for crowd counting
    Gu, Yao
    Zheng, Zhe
    Wu, Yingna
    Xie, Guangping
    Ni, Na
    PATTERN RECOGNITION LETTERS, 2025, 188 : 148 - 154
  • [40] GUIDED CONTRASTIVE SELF-SUPERVISED PRE-TRAINING FOR AUTOMATIC SPEECH RECOGNITION
    Khare, Aparna
    Wu, Minhua
    Bhati, Saurabhchand
    Droppo, Jasha
    Maas, Roland
    2022 IEEE SPOKEN LANGUAGE TECHNOLOGY WORKSHOP, SLT, 2022, : 174 - 181