Multitask Multigranularity Aggregation With Global-Guided Attention for Video Person Re-Identification

Cited by: 6
Authors
Sun, Dengdi [1]
Huang, Jiale [2]
Hu, Lei [2]
Tang, Jin [3]
Ding, Zhuanlian [4]
Affiliations
[1] Anhui Univ, Sch Artificial Intelligence, Key Lab Intelligent Comp & Signal Proc ICSP, Minist Educ, Hefei 230601, Peoples R China
[2] Anhui Univ, Wendian Coll, Hefei 230601, Peoples R China
[3] Anhui Univ, Sch Comp Sci & Technol, Anhui Prov Key Lab Multimodal Cognit Computat, Hefei 230601, Peoples R China
[4] Anhui Univ, Sch Internet, Hefei 230039, Peoples R China
Keywords
Feature extraction; Multitasking; Video sequences; Task analysis; Data mining; Semantics; Convolutional neural networks; Person re-identification; video; multi-task; multi-granularity; attention mechanism; global feature; SET;
DOI
10.1109/TCSVT.2022.3183011
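Chinese Library Classification (CLC) number
TM [Electrical engineering]; TN [Electronic technology, communication technology]
Discipline classification code
0808; 0809
Abstract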
The goal of video-based person re-identification (Re-ID) is to identify the same person across multiple non-overlapping cameras. The key to accomplishing this challenging task is to sufficiently exploit both spatial and temporal cues in video sequences. However, most current methods cannot accurately locate semantic regions or efficiently filter discriminative spatio-temporal features, so they struggle with issues such as spatial misalignment and occlusion. We therefore propose a novel feature aggregation framework, multi-task and multi-granularity aggregation with global-guided attention (MMA-GGA), which aims to adaptively generate more representative spatio-temporal aggregation features. Specifically, we develop a multi-task multi-granularity aggregation (MMA) module that extracts features at different locations and scales to identify key semantic-aware regions that are robust to spatial misalignment. Then, to determine the importance of the multi-granular semantic information, we propose a global-guided attention (GGA) mechanism that learns weights based on the global features of the video sequence, allowing our framework to identify stable local features while ignoring occlusions. As a result, the MMA-GGA framework can efficiently and effectively capture more robust and representative features. Extensive experiments on four benchmark datasets demonstrate that our MMA-GGA framework outperforms current state-of-the-art methods. In particular, our method achieves a rank-1 accuracy of 91.0% on the MARS dataset, the most widely used database, significantly outperforming existing methods.
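The idea of weighting multi-granular local features by their agreement with a sequence-level global feature can be illustrated with a minimal NumPy sketch. This is not the authors' MMA-GGA architecture: the stripe-based splitting, the granularities (1, 2, 4), the dot-product scoring, and the function names `split_granularities` and `global_guided_aggregate` are all assumptions made here for illustration, and no learned parameters are involved.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def split_granularities(frame_feats, granularities=(1, 2, 4)):
    """Pool per-frame stripe features into parts at several granularities.

    frame_feats: array of shape (T, H, C), i.e. T frames, H horizontal
    stripes, C channels (a hypothetical layout, not taken from the paper).
    Returns an array of shape (P, T, C) with P = sum(granularities).
    """
    parts = []
    for g in granularities:
        # Split the H stripes into g contiguous groups and average each group.
        for chunk in np.array_split(frame_feats, g, axis=1):
            parts.append(chunk.mean(axis=1))  # (T, C)
    return np.stack(parts, axis=0)


def global_guided_aggregate(frame_feats, granularities=(1, 2, 4)):
    """Aggregate each part over time, weighted by similarity to a global feature.

    Frames whose local part disagrees with the sequence-level feature
    (for example, an occluded frame) receive a smaller weight.
    """
    channels = frame_feats.shape[-1]
    global_feat = frame_feats.mean(axis=(0, 1))               # (C,)
    parts = split_granularities(frame_feats, granularities)   # (P, T, C)
    scores = parts @ global_feat / np.sqrt(channels)          # (P, T)
    weights = softmax(scores, axis=1)                         # sums to 1 over T
    aggregated = (weights[..., None] * parts).sum(axis=1)     # (P, C)
    return aggregated, weights
```

Calling `global_guided_aggregate` on an array of shape (8, 8, 16) returns seven aggregated part features (1 + 2 + 4 parts) of 16 channels each, together with the per-frame weights for each part. In a trained model the scoring would use learned projections rather than a raw dot product.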
Pages: 7758-7771
Page count: 14
Related papers
50 entries in total
  • [21] Triplet Attention Network for Video-Based Person Re-Identification
    Sun, Rui
    Liang, Qili
    Yang, Zi
    Zhao, Zhenghui
    Zhang, Xudong
    IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS, 2021, E104D (10) : 1775 - 1779
  • [22] Person Re-identification by Video Ranking
    Wang, Taiqing
    Gong, Shaogang
    Zhu, Xiatian
    Wang, Shengjin
    COMPUTER VISION - ECCV 2014, PT IV, 2014, 8692 : 688 - 703
  • [23] Topology and channel affinity reinforced global attention for person re-identification
    Wang, Xile
    Gao, Chengcheng
    Xin, Ming
    Zhang, Sihan
    Zhang, Miaohui
    INTERNATIONAL JOURNAL OF INTELLIGENT SYSTEMS, 2021, 36 (09) : 5136 - 5160
  • [24] Multiscale Global-Aware Channel Attention for Person Re-identification
    Zhu, Yingjie
    Yang, Wenzhong
    Wang, Liejun
    Chen, Danny
    Wang, Min
    Wei, Fuyuan
    KeZiErBieKe, HaiLaTi
    Liao, Yuanyuan
    JOURNAL OF VISUAL COMMUNICATION AND IMAGE REPRESENTATION, 2023, 90
  • [25] Global-Local Temporal Representations For Video Person Re-Identification
    Li, Jianing
    Wang, Jingdong
    Tian, Qi
    Gao, Wen
    Zhang, Shiliang
    2019 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2019), 2019, : 3957 - 3966
  • [26] Attention-guided spatial–temporal graph relation network for video-based person re-identification
    Yu Qi
    Hongwei Ge
    Wenbin Pei
    Yuxuan Liu
    Yaqing Hou
    Liang Sun
    Neural Computing and Applications, 2023, 35 : 14227 - 14241
  • [27] Feature Aggregation With Reinforcement Learning for Video-Based Person Re-Identification
    Zhang, Wei
    He, Xuanyu
    Lu, Weizhi
    Qiao, Hong
    Li, Yibin
    IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2019, 30 (12) : 3847 - 3852
  • [28] Multitask Person Re-Identification using Homoscedastic Uncertainty Learning
    Tay, Chiat-Pin
    Roy, Sharmili
    Yap, Kim-Hui
    2019 IEEE INTERNATIONAL SYMPOSIUM ON CIRCUITS AND SYSTEMS (ISCAS), 2019,
  • [29] Reliable Part Guided Multiple Level Attention Learning for Person Re-Identification
    Geng, Yanbing
    Lian, Yongjian
    Yang, Shunmin
    Zhou, Mingliang
    Cao, Jingchao
    JOURNAL OF CIRCUITS SYSTEMS AND COMPUTERS, 2021, 30 (13)
  • [30] Pose matters: Pose guided graph attention network for person re-identification
    Zhijun HE
    Hongbo ZHAO
    Jianrong WANG
    Wenquan FENG
    Chinese Journal of Aeronautics, 2023, (05) : 447 - 464