SslTransT: Self-supervised pre-training visual object tracking with Transformers

Cited by: 0
Authors
Cai, Yannan [1 ]
Tan, Ke [1 ]
Wei, Zhenzhong [1 ]
Affiliations
[1] Beihang Univ, Sch Instrumentat Sci & Optoelect Engn, Key Lab Precis Optomechatron Technol, Minist Educ, Beijing 100191, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Self-supervised; Hybrid CNN-transformer; Visual object tracking; 6D pose measurement system; BENCHMARK;
DOI
10.1016/j.optcom.2024.130329
CLC Classification
O43 [Optics];
Subject Classification Code
070207 ; 0803 ;
Abstract
Transformer-based visual object tracking surpasses conventional CNN-based counterparts in performance but incurs additional computational overhead. Existing Transformer-based trackers rely on large-scale annotated data and long training schedules. To address this issue, we introduce a self-supervised pretext task, named target localization, which randomly crops the target and then pastes it onto various background images. This copy-paste-transform data augmentation strategy can synthesize sufficient training data and facilitates routine training. In addition, freezing the CNN backbone during pre-training and randomly adjusting the template and search-area factors further accelerate training convergence. Extensive experiments on both public tracking benchmarks and real aircraft flight-test videos demonstrate that our proposed tracker, SslTransT, significantly outperforms the baseline while requiring only half the training time. Furthermore, we apply SslTransT to a 6D pose measurement system based on vision and laser ranging, achieving excellent tracking results while running in real time.
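The two training ideas named in the abstract, copy-paste-transform augmentation and backbone freezing, can be illustrated with a minimal Python sketch. This is not the authors' released code: the function names, the PIL-based compositing, the scale-jitter range, and the assumption that the tracker exposes a .backbone submodule are all hypothetical.

import random
import torch
from PIL import Image

def copy_paste_transform(target_crop: Image.Image,
                         background: Image.Image,
                         scale_range=(0.5, 1.5)):
    """Paste a cropped target onto a background at a random scale and
    position; the pasted box serves as a free pseudo-label for the
    target-localization pretext task. (Illustrative sketch; scale_range
    is an assumed jitter range, not a value from the paper.)"""
    bw, bh = background.size
    s = random.uniform(*scale_range)                # random rescale ("transform")
    tw = max(1, min(bw, int(target_crop.width * s)))
    th = max(1, min(bh, int(target_crop.height * s)))
    target = target_crop.resize((tw, th))
    x = random.randint(0, bw - tw)                  # random paste location
    y = random.randint(0, bh - th)
    composite = background.copy()
    composite.paste(target, (x, y))
    return composite, (x, y, tw, th)                # image + pseudo ground-truth box

def freeze_backbone(model: torch.nn.Module) -> None:
    """Freeze the CNN backbone so only the Transformer fusion and
    prediction-head parameters are updated during pre-training.
    (Assumes a .backbone attribute; the actual module layout may differ.)"""
    for p in model.backbone.parameters():
        p.requires_grad = False

Pre-training can then treat each (composite, box) pair as a search image with its ground-truth box, so no manual annotation is required.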
Pages: 10