Spatio-temporal interactive fusion based visual object tracking method

Cited: 0
Authors
Huang, Dandan [1 ]
Yu, Siyu [1 ]
Duan, Jin [1 ]
Wang, Yingzhi [1 ]
Yao, Anni [1 ]
Wang, Yiwen [1 ]
Xi, Junhan [1 ]
Affiliations
[1] Changchun Univ Sci & Technol, Coll Elect Informat Engn, Changchun, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
object tracking; spatio-temporal context; feature enhancement; feature fusion; attention mechanism;
DOI
10.3389/fphy.2023.1269638
Chinese Library Classification
O4 [Physics];
Discipline Code
0702;
Abstract
Visual object tracking methods often fail to exploit inter-frame correlation and struggle with challenges such as local occlusion, deformation, and background interference. To address these issues, this paper proposes a spatio-temporal interactive fusion (STIF) based visual object tracking method that fully exploits spatio-temporal background information to strengthen feature representations, improve tracking accuracy, adapt to object appearance changes, and reduce model drift. The method incorporates feature-enhancement networks in both the temporal and spatial dimensions, using spatio-temporal background information to extract salient features that aid object recognition. A spatio-temporal interactive fusion network then learns a similarity metric between the memory frame and the query frame on the enhanced features, and its interactive information fusion filters out the stronger feature representations. The proposed tracker is evaluated on four challenging public datasets, where it achieves state-of-the-art (SOTA) performance and markedly improves accuracy in complex scenarios affected by local occlusion, deformation, and background interference. On TrackingNet, a large-scale tracking dataset, the method reaches a success rate of 78.8%.
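The paper itself details the STIF architecture; the abstract's core idea of learning a similarity metric between memory and query frames via feature enhancement can, however, be illustrated with a minimal sketch. The PyTorch code below is a hypothetical rendering, not the authors' implementation: the module and parameter names (SpatioTemporalInteractiveFusion, dim, heads) are assumptions, and cross-attention with a learned gate stands in for the paper's interactive fusion mechanism.

```python
import torch
import torch.nn as nn


class SpatioTemporalInteractiveFusion(nn.Module):
    """Hypothetical sketch: enhance query-frame features with memory-frame
    context via cross-attention, then gate the fused response."""

    def __init__(self, dim: int = 256, heads: int = 8):
        super().__init__()
        # Query tokens attend to memory tokens (temporal interaction).
        self.cross_attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        # Learned gate decides, per channel, how much of the memory-enhanced
        # response to keep (a stand-in for "filtering stronger features").
        self.gate = nn.Sequential(nn.Linear(2 * dim, dim), nn.Sigmoid())
        self.norm = nn.LayerNorm(dim)

    def forward(self, query_feat: torch.Tensor,
                memory_feat: torch.Tensor) -> torch.Tensor:
        # query_feat:  (B, N_q, C) tokens from the current (query) frame
        # memory_feat: (B, N_m, C) tokens from past (memory) frames
        enhanced, _ = self.cross_attn(query_feat, memory_feat, memory_feat)
        g = self.gate(torch.cat([query_feat, enhanced], dim=-1))
        # Interactive fusion: blend original and memory-enhanced features.
        return self.norm(query_feat + g * enhanced)


if __name__ == "__main__":
    fuse = SpatioTemporalInteractiveFusion()
    q = torch.randn(2, 400, 256)  # e.g. a 20x20 query feature map, flattened
    m = torch.randn(2, 800, 256)  # tokens pooled from two memory frames
    print(fuse(q, m).shape)       # torch.Size([2, 400, 256])
```

The gated residual keeps the tracker anchored to the current frame's appearance while selectively mixing in temporal context, which is one plausible way to realize the drift reduction the abstract describes.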
Pages: 14
Related Papers
50 records total
  • [21] Spatio-temporal correlation learning for multiple object tracking
    Jian, Yajun
    Zhuang, Chihui
    He, Wenyan
    Du, Kaiwen
    Lu, Yang
    Wang, Hanzi
    2024 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING, ICASSP 2024, 2024, : 6170 - 6174
  • [22] Visual interactive clustering and querying of spatio-temporal data
    Sourina, O
    Liu, DQ
    COMPUTATIONAL SCIENCE AND ITS APPLICATIONS - ICCSA 2005, VOL 4, PROCEEDINGS, 2005, 3483 : 968 - 977
  • [23] Object tracking via Spatio-Temporal Context learning based on multi-feature fusion in stationary scene
    Cheng, Yunfei
    Wang, Wu
    AOPC 2017: OPTICAL SENSING AND IMAGING TECHNOLOGY AND APPLICATIONS, 2017, 10462
  • [24] Interactive object extraction using spatio-temporal video segmentation
    Okubo, Hidehiko
    INST. OF IMAGE INFORMATION AND TELEVISION ENGINEERS, (68): 1600
  • [25] Online visual tracking by integrating spatio-temporal cues
    He, Yang
    Pei, Mingtao
    Yang, Min
    Wu, Yuwei
    Jia, Yunde
    IET COMPUTER VISION, 2015, 9 (01) : 124 - 137
  • [26] Learning spatio-temporal correlation filter for visual tracking
    Yan, Youmin
    Guo, Xixian
    Tang, Jin
    Li, Chenglong
    Wang, Xin
    NEUROCOMPUTING, 2021, 436 : 273 - 282
  • [28] Adaptive spatio-temporal context learning for visual tracking
    Zhang, Yaqin
    Wang, Liejun
    Qin, Jiwei
    IMAGING SCIENCE JOURNAL, 2019, 67 (03): : 136 - 147
  • [29] Human tracking & visual spatio-temporal statistical analysis
    Ioannidis, D.
    Krinidis, S.
    Tzovaras, D.
    Likothanassis, S.
    2014 IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING (ICIP), 2014, : 3417 - 3419
  • [30] Deep learning of spatio-temporal information for visual tracking
    Choe, Gwangmin
    Son, Ilmyong
    Choe, Chunhwa
    So, Hyoson
    Kim, Hyokchol
    Choe, Gyongnam
    MULTIMEDIA TOOLS AND APPLICATIONS, 2022, 81 (12) : 17283 - 17302