Towards Real-World Visual Tracking With Temporal Contexts

Cited by: 23
Authors
Cao, Ziang [1]
Huang, Ziyuan [2]
Pan, Liang [1]
Zhang, Shiwei [3]
Liu, Ziwei [1]
Fu, Changhong [4]
Affiliations
[1] Nanyang Technol Univ, Sch Comp Sci & Engn, Singapore 639798, Singapore
[2] Natl Univ Singapore, Dept Mech Engn, Singapore 119077, Singapore
[3] DAMO Acad, Alibaba Grp, Hangzhou 310052, Zhejiang, Peoples R China
[4] Tongji Univ, Sch Mech Engn, Shanghai 201804, Peoples R China
Funding
Natural Science Foundation of Shanghai; National Natural Science Foundation of China
Keywords
Latency-aware evaluations; real-world tests; temporal contexts; two-level framework; visual tracking; PLUS PLUS; NETWORK;
DOI
10.1109/TPAMI.2023.3307174
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Visual tracking has made significant improvements in the past few decades. Most existing state-of-the-art trackers 1) merely aim for performance in ideal conditions while overlooking the real-world conditions; 2) adopt the tracking-by-detection paradigm, neglecting rich temporal contexts; 3) only integrate the temporal information into the template, where temporal contexts among consecutive frames are far from being fully utilized. To handle those problems, we propose a two-level framework (TCTrack) that can exploit temporal contexts efficiently. Based on it, we propose a stronger version for real-world visual tracking, i.e., TCTrack++. It boils down to two levels: features and similarity maps. Specifically, for feature extraction, we propose an attention-based temporally adaptive convolution to enhance the spatial features using temporal information, which is achieved by dynamically calibrating the convolution weights. For similarity map refinement, we introduce an adaptive temporal transformer to encode the temporal knowledge efficiently and decode it for the accurate refinement of the similarity map. To further improve the performance, we additionally introduce a curriculum learning strategy. Also, we adopt online evaluation to measure performance in real-world conditions. Exhaustive experiments on 8 well-known benchmarks demonstrate the superiority of TCTrack++. Real-world tests directly verify that TCTrack++ can be readily used in real-world applications.
Pages: 15834-15849
Number of pages: 16
Related papers
50 in total
  • [41] Anticipation in Real-World Scenes: The Role of Visual Context and Visual Memory
    Coco, Moreno I.
    Keller, Frank
    Malcolm, George L.
    COGNITIVE SCIENCE, 2016, 40 (08) : 1995 - 2024
  • [42] Real-World Battles with Real-World Data
    Brown, Jeffrey
    Bate, Andrew
    Platt, Robert
    Raebel, Marsha
    Sauer, Brian
    Trifiro, Gianluca
    PHARMACOEPIDEMIOLOGY AND DRUG SAFETY, 2017, 26 : 254 - 255
  • [43] Emotional real-world scenes impact visual search
    Bendall, Robert C. A.
    Mohamed, Aisha
    Thompson, Catherine
    COGNITIVE PROCESSING, 2019, 20 (03) : 309 - 316
  • [44] The attraction of visual attention to texts in real-world scenes
    Wang, Hsueh-Cheng
    Pomplun, Marc
    JOURNAL OF VISION, 2012, 12 (06):
  • [45] Why is real-world visual object recognition hard?
    Pinto, Nicolas
    Cox, David D.
    DiCarlo, James J.
    PLOS COMPUTATIONAL BIOLOGY, 2008, 4 (01) : 0151 - 0156
  • [46] Emotional real-world scenes impact visual search
    Robert C. A. Bendall
    Aisha Mohamed
    Catherine Thompson
    Cognitive Processing, 2019, 20 : 309 - 316
  • [47] The Role of Real-World Statistical Regularities in Visual Perception
    Beck, Diane M.
    Center, Evan G.
    Shao, Zhenan
    CURRENT DIRECTIONS IN PSYCHOLOGICAL SCIENCE, 2024, 33 (05) : 317 - 324
  • [48] Disentangling visual imagery and perception of real-world objects
    Lee, Sue-Hyun
    Kravitz, Dwight J.
    Baker, Chris I.
    NEUROIMAGE, 2012, 59 (04) : 4064 - 4073
  • [49] Cognitive robots learning failure contexts through real-world experimentation
    Karapinar, Sertac
    Sariel, Sanem
    AUTONOMOUS ROBOTS, 2015, 39 (04) : 469 - 485
  • [50] Sustainability-oriented labs in real-world contexts: An exploratory review
    McCrory, Gavin
    Schapke, Niko
    Holmen, Johan
    Holmberg, John
    JOURNAL OF CLEANER PRODUCTION, 2020, 277 (277)