Spatiotemporal Interaction Transformer Network for Video-Based Person Reidentification in Internet of Things

Cited by: 3
Authors
Yang, Fan [1]
Li, Wei [2, 3]
Liang, Binbin [2]
Zhang, Jianwei [1]
Affiliations
[1] Sichuan Univ, Coll Comp Sci, Chengdu 610065, Peoples R China
[2] Sichuan Univ, Sch Aeronaut & Astronaut, Chengdu 610065, Peoples R China
[3] Beijing Inst Technol, State Key Lab Explos Sci & Technol, Beijing 100081, Peoples R China
Keywords
Internet of Things; local feature; person reidentification (Re-ID); spatiotemporal interaction; REPRESENTATION; ATTENTION; APPEARANCE;
DOI
10.1109/JIOT.2023.3250652
Chinese Library Classification
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
Video-based person reidentification, which is a significant application in the Internet of Things, aims to identify the same person in different video sequences across nonoverlapping cameras. Existing methods usually utilize temporal cues to enhance spatial features. However, these methods learn the temporal and spatial information separately, which breaks the relationship between them and ignores the positive role of temporal information for learning frame-level spatial representation in the process of spatial representation learning. In this article, we propose a novel spatiotemporal interaction transformer network (SITN) to solve this problem. To model the temporal information and the relationship between frames, we introduce a temporal interaction module (TIM) to interact between frame information. Meanwhile, we combine TIM with spatial transformer encoder to explore the positive role of temporal information in the learning procedure of the frame-level spatial feature. Moreover, we propose a transformer local learning scheme by reconstructing the 2-D spatial information of the frame patch sequences and extracting local features in a striped manner to strengthen the discriminative capability of our model. Extensive experiments are conducted on four public benchmarks. The results show that our model is superior compared with state-of-the-art methods.
Pages: 12537-12547
Number of pages: 11
Related papers
50 in total
  • [31] Multi-Granularity Aggregation with Spatiotemporal Consistency for Video-Based Person Re-Identification
    Lee, Hean Sung
    Kim, Minjung
    Jang, Sungjun
    Bae, Han Byeol
    Lee, Sangyoun
    SENSORS, 2024, 24 (07)
  • [32] Diverse part attentive network for video-based person re-identification
    Shu, Xiujun
    Li, Ge
    Wei, Longhui
    Zhong, Jia-Xing
    Zang, Xianghao
    Zhang, Shiliang
    Wang, Yaowei
    Liang, Yongsheng
    Tian, Qi
    PATTERN RECOGNITION LETTERS, 2021, 149 : 17 - 23
  • [33] SANet: Statistic Attention Network for Video-Based Person Re-Identification
    Bai, Shutao
    Ma, Bingpeng
    Chang, Hong
    Huang, Rui
    Shan, Shiguang
    Chen, Xilin
    IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2022, 32 (06) : 3866 - 3879
  • [34] Context Sensing Attention Network for Video-based Person Re-identification
    Wang, Kan
    Ding, Changxing
    Pang, Jianxin
    Xu, Xiangmin
    ACM TRANSACTIONS ON MULTIMEDIA COMPUTING COMMUNICATIONS AND APPLICATIONS, 2023, 19 (04)
  • [35] Spatial Quality Aware Network for Video-Based Person Re-identification
    Wang, Yujie
    Leng, Biao
    Song, Guanglu
    NEURAL INFORMATION PROCESSING (ICONIP 2017), PT III, 2017, 10636 : 34 - 43
  • [36] Frequency Information Disentanglement Network for Video-Based Person Re-Identification
    Liu, Liangchen
    Yang, Xi
    Wang, Nannan
    Gao, Xinbo
    IEEE TRANSACTIONS ON IMAGE PROCESSING, 2023, 32 : 4287 - 4298
  • [37] Learning to Recognize Video-Based Spatiotemporal Events
    Veeraraghavan, Harini
    Papanikolopoulos, Nikolaos P.
    IEEE TRANSACTIONS ON INTELLIGENT TRANSPORTATION SYSTEMS, 2009, 10 (04) : 628 - 638
  • [38] VIDEO-BASED PERSON AUTHENTICATION WITH RANDOM PASSWORDS
    Liao, Chia-Wei
    Lin, Wei-Yang
    Lin, Chia-Wen
    2008 IEEE INTERNATIONAL CONFERENCE ON MULTIMEDIA AND EXPO, VOLS 1-4, 2008, : 581 - +
  • [39] Multi-Agent based Framework for Person ReIdentification in Video Surveillance
    Al Rahbi, Muna Saif
    Edirisinghe, Eran
    Fatima, Shaheen
    PROCEEDINGS OF 2016 FUTURE TECHNOLOGIES CONFERENCE (FTC), 2016, : 1349 - 1352
  • [40] Visible Thermal Person Reidentification via Mutual Learning Convolutional Neural Network in 6G-Enabled Visual Internet of Things
    Zhang, Zhong
    Wang, Sen
    IEEE INTERNET OF THINGS JOURNAL, 2021, 8 (20) : 15259 - 15266