Spatiotemporal Interaction Transformer Network for Video-Based Person Reidentification in Internet of Things

Cited by: 3
Authors
Yang, Fan [1]
Li, Wei [2, 3]
Liang, Binbin [2]
Zhang, Jianwei [1]
Affiliations
[1] Sichuan Univ, Coll Comp Sci, Chengdu 610065, Peoples R China
[2] Sichuan Univ, Sch Aeronaut & Astronaut, Chengdu 610065, Peoples R China
[3] Beijing Inst Technol, State Key Lab Explos Sci & Technol, Beijing 100081, Peoples R China
Keywords
Internet of Things; local feature; person reidentification (Re-ID); spatiotemporal interaction; REPRESENTATION; ATTENTION; APPEARANCE;
DOI
10.1109/JIOT.2023.3250652
Chinese Library Classification number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
Video-based person reidentification, a significant application in the Internet of Things, aims to identify the same person in different video sequences across nonoverlapping cameras. Existing methods usually use temporal cues to enhance spatial features. However, these methods learn temporal and spatial information separately, which breaks the relationship between them and ignores the positive role that temporal information can play while frame-level spatial representations are being learned. In this article, we propose a novel spatiotemporal interaction transformer network (SITN) to solve this problem. To model the temporal information and the relationship between frames, we introduce a temporal interaction module (TIM) that exchanges information between frames. Meanwhile, we combine TIM with a spatial transformer encoder to explore the positive role of temporal information in learning frame-level spatial features. Moreover, we propose a transformer local learning scheme that reconstructs the 2-D spatial layout of the frame patch sequences and extracts local features in a striped manner to strengthen the discriminative capability of our model. Extensive experiments are conducted on four public benchmarks. The results show that our model outperforms state-of-the-art methods.
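The striped local-feature idea mentioned in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the abstract gives no architectural details, so the function name `striped_local_features`, the row-major patch order, the equal-height horizontal stripes, and the use of average pooling are all assumptions made for illustration only.

```python
from typing import List

Vector = List[float]


def striped_local_features(
    tokens: List[Vector], grid_h: int, grid_w: int, num_stripes: int
) -> List[Vector]:
    """Pool a flat patch-token sequence into horizontal stripe features.

    Hypothetical helper: assumes tokens are in row-major order over a
    grid_h x grid_w patch grid and that stripes are pooled by averaging.
    """
    if len(tokens) != grid_h * grid_w:
        raise ValueError("token count must equal grid_h * grid_w")
    if grid_h % num_stripes != 0:
        raise ValueError("grid_h must be divisible by num_stripes")

    dim = len(tokens[0])
    rows_per_stripe = grid_h // num_stripes

    # Rebuild the 2-D layout: one list of tokens per patch row.
    grid = [tokens[r * grid_w:(r + 1) * grid_w] for r in range(grid_h)]

    features = []
    for s in range(num_stripes):
        # Collect every token that falls inside this horizontal stripe.
        stripe_rows = grid[s * rows_per_stripe:(s + 1) * rows_per_stripe]
        stripe_tokens = [tok for row in stripe_rows for tok in row]
        # Average-pool the stripe into a single local feature vector.
        mean = [
            sum(tok[d] for tok in stripe_tokens) / len(stripe_tokens)
            for d in range(dim)
        ]
        features.append(mean)
    return features


# Toy input: a 4 x 2 patch grid whose 2-D tokens store [row, column].
tokens = [[float(r), float(c)] for r in range(4) for c in range(2)]
stripes = striped_local_features(tokens, grid_h=4, grid_w=2, num_stripes=2)
print(stripes)  # [[0.5, 0.5], [2.5, 0.5]]
```

In the toy input each token stores its own grid coordinates, so the pooled stripe vectors show directly which rows contributed to each local feature.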
Pages: 12537-12547
Number of pages: 11