PersonMAE: Person Re-Identification Pre-Training With Masked AutoEncoders

Cited by: 0
Authors
Hu, Hezhen [1]
Dong, Xiaoyi [1]
Bao, Jianmin [2]
Chen, Dongdong [3]
Yuan, Lu
Chen, Dong [2]
Li, Houqiang [1]
Affiliations
[1] Univ Sci & Technol China, Hefei 230027, Peoples R China
[2] Microsoft Res Asia, Beijing 100080, Peoples R China
[3] Microsoft, Redmond, WA 98052 USA
Keywords
Task analysis; Pedestrians; Semantics; Robustness; Decoding; Visualization; Image color analysis; High-quality ReID representation; masked autoencoder; pre-training; NETWORK; GAN
DOI
10.1109/TMM.2024.3405649
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Pre-training plays an increasingly important role in learning generic feature representations for person re-identification (ReID). We argue that a high-quality ReID representation should have three properties: multi-level awareness, occlusion robustness, and cross-region invariance. To this end, we propose a simple yet effective pre-training framework, PersonMAE, which introduces two core designs into masked autoencoders to better serve the ReID task. 1) PersonMAE generates two regions from the given image, with RegionA as the input and RegionB as the prediction target. RegionA is corrupted with block-wise masking to mimic the occlusion common in ReID, and its remaining visible parts are fed into the encoder. 2) PersonMAE then predicts the whole of RegionB at both the pixel level and the semantic feature level. This design encourages the pre-trained feature representations to exhibit the three properties mentioned above. These properties make PersonMAE compatible with downstream ReID tasks, leading to state-of-the-art performance on four downstream ReID tasks, i.e., supervised (holistic and occluded settings) and unsupervised (UDA and USL settings). Notably, in the commonly adopted supervised setting, PersonMAE with a ViT-B backbone achieves 79.8% and 69.5% mAP on the MSMT17 and OccDuke datasets, surpassing the previous state of the art by large margins of +8.0 mAP and +5.3 mAP, respectively.
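The region sampling and block-wise masking described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `sample_region` and `blockwise_mask`, the 16 x 8 patch grid, the 50% mask ratio, the maximum block size, and the crop scale are all assumptions chosen for illustration, since the abstract does not state these values.

```python
import random


def sample_region(img_h, img_w, min_scale=0.5, rng=random):
    """Sample one crop box (top, left, height, width) inside the image.

    min_scale is a hypothetical lower bound on the crop side length.
    """
    h = rng.randint(int(img_h * min_scale), img_h)
    w = rng.randint(int(img_w * min_scale), img_w)
    top = rng.randint(0, img_h - h)
    left = rng.randint(0, img_w - w)
    return top, left, h, w


def blockwise_mask(grid_h, grid_w, mask_ratio=0.5, max_block=4, rng=random):
    """Mask rectangular blocks of patches until the target count is reached.

    Returns (masked, visible) as sorted lists of flat patch indices.
    mask_ratio and max_block are assumed values, not taken from the paper.
    """
    target = int(grid_h * grid_w * mask_ratio)
    masked = set()
    while len(masked) < target:
        # Draw a random rectangle on the patch grid.
        bh = rng.randint(1, min(max_block, grid_h))
        bw = rng.randint(1, min(max_block, grid_w))
        top = rng.randint(0, grid_h - bh)
        left = rng.randint(0, grid_w - bw)
        for r in range(top, top + bh):
            for c in range(left, left + bw):
                if len(masked) < target:  # stop exactly at the target count
                    masked.add(r * grid_w + c)
    visible = [i for i in range(grid_h * grid_w) if i not in masked]
    return sorted(masked), visible


rng = random.Random(0)
region_a = sample_region(256, 128, rng=rng)  # input region
region_b = sample_region(256, 128, rng=rng)  # prediction target region
masked, visible = blockwise_mask(16, 8, mask_ratio=0.5, rng=rng)
print("RegionA box:", region_a)
print("RegionB box:", region_b)
print("visible patches fed to the encoder:", len(visible))
```

In this sketch only the `visible` patch indices of RegionA would be passed to an encoder, while a decoder would be asked to reconstruct RegionB; the pixel-level and feature-level prediction heads are omitted because the abstract does not describe them in enough detail.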
Pages: 10029-10040
Number of pages: 12