PersonMAE: Person Re-Identification Pre-Training With Masked AutoEncoders

Times Cited: 0
Authors
Hu, Hezhen [1 ]
Dong, Xiaoyi [1 ]
Bao, Jianmin [2 ]
Chen, Dongdong [3 ]
Yuan, Lu
Chen, Dong [2 ]
Li, Houqiang [1 ]
Affiliations
[1] Univ Sci & Technol China, Hefei 230027, Peoples R China
[2] Microsoft Res Asia, Beijing 100080, Peoples R China
[3] Microsoft, Redmond, WA 98052 USA
Keywords
Task analysis; Pedestrians; Semantics; Robustness; Decoding; Visualization; Image color analysis; High-quality ReID representation; masked autoencoder; pre-training; NETWORK; GAN
DOI
10.1109/TMM.2024.3405649
CLC Number
TP [Automation & Computer Technology]
Discipline Code
0812
Abstract
Pre-training plays an increasingly important role in learning generic feature representations for Person Re-identification (ReID). We argue that a high-quality ReID representation should have three properties: multi-level awareness, occlusion robustness, and cross-region invariance. To this end, we propose a simple yet effective pre-training framework, PersonMAE, which incorporates two core designs into masked autoencoders to better serve the Person ReID task. 1) PersonMAE generates two regions from a given image, with RegionA as the input and RegionB as the prediction target. RegionA is corrupted with block-wise masking to mimic common occlusions in ReID, and only its remaining visible parts are fed into the encoder. 2) PersonMAE then predicts the whole RegionB at both the pixel level and the semantic feature level. This encourages the pre-trained feature representation to acquire the three properties above, making PersonMAE well suited to downstream Person ReID tasks and leading to state-of-the-art performance on four of them: supervised (holistic and occluded settings) and unsupervised (UDA and USL settings). Notably, in the commonly adopted supervised setting, PersonMAE with a ViT-B backbone achieves 79.8% and 69.5% mAP on the MSMT17 and OccDuke datasets, surpassing the previous state of the art by large margins of +8.0 and +5.3 mAP, respectively.
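As a rough illustration of the pipeline the abstract describes, the sketch below crops two regions from a pedestrian image (RegionA as encoder input, RegionB as prediction target) and applies block-wise masking to RegionA. This is a minimal sketch, not the authors' released code: the function names (sample_two_regions, blockwise_mask) and all hyperparameters (crop size, mask ratio, block size, 16x16 patches) are illustrative assumptions.

```python
# Minimal sketch of a PersonMAE-style input pipeline (assumed details).
import torch

def sample_two_regions(img, crop_h=192, crop_w=96):
    """Randomly crop two (possibly overlapping) regions from one pedestrian
    image: RegionA is the encoder input, RegionB the prediction target."""
    _, H, W = img.shape
    def rand_crop():
        top = torch.randint(0, H - crop_h + 1, (1,)).item()
        left = torch.randint(0, W - crop_w + 1, (1,)).item()
        return img[:, top:top + crop_h, left:left + crop_w]
    return rand_crop(), rand_crop()  # RegionA, RegionB

def blockwise_mask(patches_h, patches_w, mask_ratio=0.5, block=2):
    """Block-wise masking: hide contiguous block x block patch groups until
    roughly mask_ratio of all patches are masked, mimicking occlusion."""
    mask = torch.zeros(patches_h, patches_w, dtype=torch.bool)
    target = int(mask_ratio * patches_h * patches_w)
    while mask.sum() < target:
        top = torch.randint(0, patches_h - block + 1, (1,)).item()
        left = torch.randint(0, patches_w - block + 1, (1,)).item()
        mask[top:top + block, left:left + block] = True
    return mask.flatten()  # True = masked patch

# Usage: only the visible patches of RegionA would go through the ViT
# encoder; a decoder then regresses the *whole* RegionB at pixel level and
# at semantic feature level, per the abstract.
img = torch.rand(3, 256, 128)               # one pedestrian image (C, H, W)
region_a, region_b = sample_two_regions(img)
mask = blockwise_mask(192 // 16, 96 // 16)  # assuming 16x16 ViT patches
print(region_a.shape, region_b.shape, mask.float().mean().item())
```

Encoding only the visible patches keeps the encoder cost low (the standard MAE efficiency trick), while predicting a different region (RegionB) rather than the masked input itself is what pushes the representation toward the cross-region invariance the abstract calls for.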
Pages: 10029-10040
Page count: 12
Related Papers
50 records in total
  • [31] Peng, Chunlei; Wang, Boyu; Liu, Decheng; Wang, Nannan; Hu, Ruimin; Gao, Xinbo. Masked Attribute Description Embedding for Cloth-Changing Person Re-Identification. IEEE TRANSACTIONS ON MULTIMEDIA, 2025, 27: 1475-1485.
  • [32] Tong, Zhan; Song, Yibing; Wang, Jue; Wang, Limin. VideoMAE: Masked Autoencoders are Data-Efficient Learners for Self-Supervised Video Pre-Training. ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 35 (NEURIPS 2022), 2022.
  • [33] Dant, Aaron P.; Kacenjar, Steve T.; Neely, Ronald. Self directed training of person re-identification with synthetic data. APPLICATIONS OF MACHINE LEARNING 2021, 2021, 11843.
  • [34] Jin, Haibo; Wang, Xiaobo; Liao, Shengcai; Li, Stan Z. Deep Person Re-Identification with Improved Embedding and Efficient Training. 2017 IEEE INTERNATIONAL JOINT CONFERENCE ON BIOMETRICS (IJCB), 2017: 261-267.
  • [35] Li, Xuan; Zhang, Tao; Zhao, Xin; Sun, Xing; Yi, Zhengming. Learning fused features with parallel training for person re-identification. KNOWLEDGE-BASED SYSTEMS, 2021, 220.
  • [36] Winkler, WE. Re-identification methods for masked microdata. PRIVACY IN STATISTICAL DATABASES, PROCEEDINGS, 2004, 3050: 216-230.
  • [37] Zheng, Liang; Zhang, Hengheng; Sun, Shaoyan; Chandraker, Manmohan; Yang, Yi; Tian, Qi. Person Re-identification in the Wild. 30TH IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2017), 2017: 3346-3355.
  • [38] Zhuo, Jiaxuan; Chen, Zeyu; Lai, Jianhuang; Wang, Guangcong. Occluded Person Re-Identification. 2018 IEEE INTERNATIONAL CONFERENCE ON MULTIMEDIA AND EXPO (ICME), 2018.
  • [39] Layne, Ryan; Hospedales, Timothy; Gong, Shaogang. Person Re-identification by Attributes. PROCEEDINGS OF THE BRITISH MACHINE VISION CONFERENCE 2012, 2012.
  • [40] Mazzon, Riccardo; Tahir, Syed Fahad; Cavallaro, Andrea. Person re-identification in crowd. PATTERN RECOGNITION LETTERS, 2012, 33(14): 1828-1837.