Temporal Correlation Vision Transformer for Video Person Re-Identification

Authors
Wu, Pengfei [1, 2]
Wang, Le [1, 2]
Zhou, Sanping [1, 2]
Hua, Gang [4]
Sun, Changyin [3]
Affiliations
[1] Xi An Jiao Tong Univ, Natl Engn Res Ctr Visual Informat & Applicat, Natl Key Lab Human Machine Hybrid Augmented Intel, Xian, Peoples R China
[2] Xi An Jiao Tong Univ, Inst Artificial Intelligence & Robot, Xian, Peoples R China
[3] Anhui Univ, Sch Artificial Intelligence, Hefei, Peoples R China
[4] Wormpex AI Res, Bellevue, WA USA
Funding
National Key Research and Development Program of China
DOI
Not available
Chinese Library Classification number
TP18 [Artificial Intelligence Theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Video Person Re-Identification (Re-ID) is the task of retrieving persons across multi-camera surveillance systems. Despite progress in leveraging spatio-temporal information in videos, occlusion in dense crowds still hinders further improvement. To address this issue, we propose a Temporal Correlation Vision Transformer (TCViT) for video person Re-ID. TCViT consists of a Temporal Correlation Attention (TCA) module and a Learnable Temporal Aggregation (LTA) module. The TCA module is designed to reduce the impact of non-target persons by relative state, while the LTA module aggregates frame-level features based on their completeness. Specifically, TCA is a parameter-free module that first aligns frame-level features to restore semantic coherence in videos and then enhances the features of the target person according to temporal correlation. Additionally, unlike previous methods that treat each frame equally with a pooling layer, LTA introduces a lightweight learnable module that weights and aggregates frame-level features under the guidance of a classification score. Extensive experiments on four prevalent benchmarks demonstrate that our method achieves state-of-the-art performance in video Re-ID.
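The abstract contrasts equal-weight pooling with learned per-frame weighting but gives no equations, so the following is only an illustrative sketch of that general idea, not the authors' LTA module. The linear scoring layer (`w`, `b`), the softmax normalization, and the function names are all assumptions introduced here; the classification-score guidance mentioned in the abstract is not modeled.

```python
import numpy as np


def softmax(x):
    """Numerically stable softmax over a 1-D array."""
    shifted = x - np.max(x)
    exps = np.exp(shifted)
    return exps / exps.sum()


def mean_pool(frame_feats):
    """Baseline: every frame contributes equally. frame_feats has shape (T, D)."""
    return frame_feats.mean(axis=0)


def weighted_aggregate(frame_feats, w, b=0.0):
    """Hypothetical learnable aggregation (an assumption, not the paper's LTA).

    A linear layer (w, b) scores each frame, the scores are normalized with a
    softmax, and the clip-level feature is the weighted sum of frame features.
    Returns the pooled feature of shape (D,) and the frame weights of shape (T,).
    """
    scores = frame_feats @ w + b      # one scalar score per frame, shape (T,)
    weights = softmax(scores)         # non-negative, sums to 1
    pooled = weights @ frame_feats    # weighted sum over frames, shape (D,)
    return pooled, weights


if __name__ == "__main__":
    # Three frames with two-dimensional features (toy values).
    feats = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
    pooled, weights = weighted_aggregate(feats, np.array([5.0, 5.0]))
    print("frame weights:", weights)
    print("weighted feature:", pooled)
    print("mean-pooled feature:", mean_pool(feats))
```

With an all-zero `w`, every frame receives the same score and the result reduces to mean pooling, which makes the equal-weight baseline a special case of this sketch.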
Pages: 6083-6091
Page count: 9