A Multi-Scale Spatial-Temporal Attention Model for Person Re-Identification in Videos

Cited by: 35
Authors
Zhang, Wei [1]
He, Xuanyu [1]
Yu, Xiaodong [1]
Lu, Weizhi [1]
Zha, Zhengjun [2]
Tian, Qi [3]
Affiliations
[1] Shandong Univ, Sch Control Sci & Engn, Jinan 250100, Peoples R China
[2] Univ Sci & Technol China, Sch Informat Sci & Technol, Hefei 230052, Peoples R China
[3] Univ Texas San Antonio, Dept Comp Sci, San Antonio, TX 78249 USA
Funding
National Natural Science Foundation of China
Keywords
Video-based person re-id; spatial-temporal attention; multi-scale pooling
DOI
10.1109/TIP.2019.2959653
Chinese Library Classification
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
In this paper, we propose a novel deep-neural-network-based attention model that learns representative local regions from a video sequence for person re-identification. Specifically, we propose a multi-scale spatial-temporal attention (MSTA) model that measures the regions of each frame at different scales from the perspective of the whole video sequence. Compared with traditional temporal attention models, MSTA exploits the importance of the local regions of each frame to the whole-video representation in both the spatial and temporal domains. A new training strategy is designed for the proposed model by combining the image-to-image mode with the video-to-video mode. Extensive experiments on benchmark datasets demonstrate the superiority of the proposed model over state-of-the-art methods.
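The abstract gives no architectural details, so the following is only a minimal sketch of the general idea of attending over multi-scale regions jointly across space and time. It is not the MSTA model itself. Every specific here is an assumption made for illustration: horizontal-stripe average pooling as the multi-scale step, a dot product with a hypothetical `query` vector as the region score, a softmax over all regions of all frames, and the function names `stripe_pool` and `attend_video`.

```python
import math
from typing import List, Sequence

Vector = List[float]


def stripe_pool(frame: Sequence[Vector], num_stripes: int) -> List[Vector]:
    """Average-pool a frame (H row features of length C) into horizontal stripes."""
    height = len(frame)
    if height % num_stripes != 0:
        raise ValueError("frame height must be divisible by num_stripes")
    rows_per_stripe = height // num_stripes
    channels = len(frame[0])
    regions = []
    for s in range(num_stripes):
        rows = frame[s * rows_per_stripe:(s + 1) * rows_per_stripe]
        regions.append(
            [sum(row[c] for row in rows) / rows_per_stripe for c in range(channels)]
        )
    return regions


def softmax(scores: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a flat list of scores."""
    peak = max(scores)
    exps = [math.exp(x - peak) for x in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend_video(
    frames: Sequence[Sequence[Vector]],
    query: Vector,
    scales: Sequence[int] = (1, 2, 4),
) -> Vector:
    """Return one attention-weighted vector per scale, concatenated.

    Assumption: `query` stands in for learned attention parameters.
    """
    video_feature: Vector = []
    for num_stripes in scales:
        # Gather regions from every frame so the softmax spans space and time.
        regions = [r for frame in frames for r in stripe_pool(frame, num_stripes)]
        scores = [sum(q * x for q, x in zip(query, r)) for r in regions]
        weights = softmax(scores)
        pooled = [
            sum(w * r[c] for w, r in zip(weights, regions))
            for c in range(len(query))
        ]
        video_feature.extend(pooled)
    return video_feature


if __name__ == "__main__":
    # Toy input: 3 frames, 4 rows per frame, 2 channels per row.
    toy_frames = [[[float(t + h), 1.0] for h in range(4)] for t in range(3)]
    feature = attend_video(toy_frames, query=[0.1, 0.0])
    print(len(feature))
```

Because the softmax runs over the regions of all frames at once, a region's weight depends on the whole sequence and not only on its own frame, which is the property the abstract contrasts with purely temporal attention.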
Pages: 3365-3373
Number of pages: 9