3-D PersonVLAD: Learning Deep Global Representations for Video-Based Person Reidentification

Cited by: 79
Authors
Wu, Lin [1, 2]
Wang, Yang [1, 3]
Shao, Ling [4]
Wang, Meng [1]
Affiliations
[1] Hefei Univ Technol, Sch Comp Sci & Informat Engn, Hefei 230000, Anhui, Peoples R China
[2] Univ Queensland, Brisbane, Qld 4072, Australia
[3] Dalian Univ Technol, Fac Elect Engn, Dalian 116024, Peoples R China
[4] Incept Inst Artificial Intelligence, Abu Dhabi, U Arab Emirates
Funding
National Natural Science Foundation of China
Keywords
Feature extraction; Spatiotemporal phenomena; Neural networks; Streaming media; Solid modeling; Learning systems; Computer science; 3-D convolution; global representations; person reidentification (re-ID); vector of local aggregated descriptors (VLAD);
DOI
10.1109/TNNLS.2019.2891244
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
We present a global deep video representation learning approach for video-based person reidentification (re-ID) that aggregates local 3-D features across the entire video extent. Existing methods typically extract frame-wise deep features from 2-D convolutional networks (ConvNets), which are then pooled temporally to produce video-level representations. However, 2-D ConvNets lose temporal priors immediately after the convolutions, and a separate temporal pooling step is limited in capturing human motion in short sequences. In this paper, we present global video representation learning as a novel layer, complementary to 3-D ConvNets, that captures the appearance and motion dynamics in full-length videos. Nevertheless, encoding each video frame in its entirety and computing aggregate global representations across all frames is highly challenging due to occlusions and misalignments. To resolve this, our proposed network is further augmented with 3-D part alignment to learn local features through a soft-attention module. These attended features are statistically aggregated to yield identity-discriminative representations. Our global 3-D features are demonstrated to achieve state-of-the-art results on three benchmark data sets: MARS, Imagery Library for Intelligent Detection Systems-Video Re-identification (iLIDS-VID), and PRID2011.
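The abstract describes aggregating local features into a VLAD-style global descriptor. As a rough illustration only, the following NumPy sketch shows generic soft-assignment VLAD pooling over a set of local descriptors. It is not the authors' network: the function name `soft_vlad`, the `alpha` sharpness parameter, the distance-based softmax weights, and the fixed (not learned) cluster centers are all assumptions made for this example, and the 3-D convolutional backbone and part-alignment attention are omitted.

```python
import numpy as np

def soft_vlad(features, centers, alpha=1.0):
    """Aggregate local descriptors into one VLAD-style vector.

    features: (N, D) local descriptors (e.g. flattened spatiotemporal positions).
    centers:  (K, D) cluster centers (hypothetical; fixed here, not learned).
    alpha:    assumed sharpness of the soft assignment.
    Returns a (K * D,) L2-normalized descriptor.
    """
    # Residual of every descriptor to every center: shape (N, K, D).
    residuals = features[:, None, :] - centers[None, :, :]
    # Squared Euclidean distances: shape (N, K).
    sq_dist = (residuals ** 2).sum(axis=2)
    # Softmax over centers gives soft-assignment weights (numerically stabilized).
    logits = -alpha * sq_dist
    logits -= logits.max(axis=1, keepdims=True)
    weights = np.exp(logits)
    weights /= weights.sum(axis=1, keepdims=True)
    # Weighted sum of residuals per center: shape (K, D).
    vlad = (weights[:, :, None] * residuals).sum(axis=0)
    # Intra-normalization per center, then global L2 normalization.
    vlad /= np.maximum(np.linalg.norm(vlad, axis=1, keepdims=True), 1e-12)
    vlad = vlad.ravel()
    return vlad / max(np.linalg.norm(vlad), 1e-12)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    local_feats = rng.normal(size=(20, 8))   # 20 local descriptors of dimension 8
    cluster_centers = rng.normal(size=(4, 8))
    descriptor = soft_vlad(local_feats, cluster_centers)
    print(descriptor.shape)
```

In a trainable variant, the centers and assignment weights would typically be learned end to end; here they are fixed so the pooling step can be read in isolation.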
Pages: 3347-3359
Page count: 13