Spatiotemporal Interaction Transformer Network for Video-Based Person Reidentification in Internet of Things

Cited by: 3
Authors
Yang, Fan [1]
Li, Wei [2, 3]
Liang, Binbin [2]
Zhang, Jianwei [1]
Affiliations
[1] Sichuan Univ, Coll Comp Sci, Chengdu 610065, Peoples R China
[2] Sichuan Univ, Sch Aeronaut & Astronaut, Chengdu 610065, Peoples R China
[3] Beijing Inst Technol, State Key Lab Explos Sci & Technol, Beijing 100081, Peoples R China
Keywords
Internet of Things; local feature; person reidentification (Re-ID); spatiotemporal interaction; REPRESENTATION; ATTENTION; APPEARANCE
DOI
10.1109/JIOT.2023.3250652
Chinese Library Classification number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
Video-based person reidentification, a significant application in the Internet of Things, aims to identify the same person in different video sequences across nonoverlapping cameras. Existing methods usually use temporal cues to enhance spatial features. However, these methods learn temporal and spatial information separately, which breaks the relationship between them and ignores the positive role that temporal information can play while frame-level spatial representations are being learned. In this article, we propose a novel spatiotemporal interaction transformer network (SITN) to solve this problem. To model the temporal information and the relationships between frames, we introduce a temporal interaction module (TIM) that exchanges information between frames. Meanwhile, we combine TIM with a spatial transformer encoder to explore the positive role of temporal information in learning frame-level spatial features. Moreover, we propose a transformer local learning scheme that reconstructs the 2-D spatial information of the frame patch sequences and extracts local features in a striped manner to strengthen the discriminative capability of our model. Extensive experiments are conducted on four public benchmarks. The results show that our model outperforms state-of-the-art methods.
Pages: 12537-12547
Page count: 11