Self-Supervised Modality-Aware Multiple Granularity Pre-Training for RGB-Infrared Person Re-Identification

Cited by: 5
Authors
Wan, Lin [1]
Jing, Qianyan [1]
Sun, Zongyuan [1]
Zhang, Chuang [2]
Li, Zhihang [3]
Chen, Yehansen [1]
Affiliations
[1] China Univ Geosci, Sch Comp Sci, Wuhan 430078, Peoples R China
[2] Nanjing Univ Sci & Technol, Sch Comp Sci & Engn, Nanjing 210094, Peoples R China
[3] Chinese Acad Sci, Inst Automat, Beijing 100190, Peoples R China
Keywords
Task analysis; Training; Feature extraction; Lighting; Cameras; Visualization; Self-supervised learning; Cross-modality person re-identification; Multi-modality pre-training
DOI
10.1109/TIFS.2023.3273911
Chinese Library Classification (CLC) number
TP301 [Theory, methods]
Discipline classification code
081202
Abstract
RGB-Infrared person re-identification (RGB-IR ReID) aims to associate people across disjoint RGB and IR camera views. Currently, the state-of-the-art performance of RGB-IR ReID is not as impressive as that of conventional ReID. Much of this gap is due to the notorious modality bias training issue introduced by single-modality ImageNet pre-training, which may yield RGB-biased representations that severely hinder cross-modality image retrieval. This paper makes the first attempt to tackle the task from a pre-training perspective. We propose a self-supervised pre-training solution, named Modality-Aware Multiple Granularity Learning (MMGL), which trains models from scratch on multi-modal ReID datasets only, yet achieves results competitive with ImageNet pre-training without using any external data or sophisticated tuning tricks. First, we develop a simple but effective 'permutation recovery' pretext task that globally maps shuffled RGB-IR images into a shared latent permutation space, providing modality-invariant global representations for downstream ReID tasks. Second, we present a part-aware cycle-contrastive (PCC) learning strategy that uses cross-modality cycle-consistency to maximize agreement between semantically similar RGB-IR image patches. This enables contrastive learning in unpaired multi-modal scenarios and further improves the discriminability of local features without laborious instance augmentation. Based on these designs, MMGL effectively alleviates the modality bias training problem. Extensive experiments demonstrate that it learns better representations (+8.03% Rank-1 accuracy) with faster training (converging in only a few hours) and higher data efficiency (< 5% data size) than ImageNet pre-training. The results also suggest that it generalizes well to various existing models and losses, and that it has promising transferability across datasets. The code will be released at https://github.com/hansonchen1996/MMGL.
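The two ideas named in the abstract can be illustrated with a toy NumPy sketch. This is not the authors' implementation: the abstract gives no details, so the stripe-based shuffling, the shared permutation for both modalities, the hard nearest-neighbour hops, and every function name below (`shuffle_stripes`, `make_pretext_pair`, `cycle_back_indices`) are assumptions made purely for illustration.

```python
import numpy as np


def shuffle_stripes(image, perm):
    """Split an (H, W, C) image into len(perm) horizontal stripes and reorder them.

    Assumes H is divisible by len(perm) so that all stripes have equal height.
    """
    stripes = np.array_split(image, len(perm), axis=0)
    return np.concatenate([stripes[i] for i in perm], axis=0)


def make_pretext_pair(rgb, ir, num_stripes, rng):
    """Shuffle an RGB and an IR image with one shared permutation (an assumption).

    The permutation is the self-supervised target a model would try to recover.
    """
    perm = rng.permutation(num_stripes)
    return shuffle_stripes(rgb, perm), shuffle_stripes(ir, perm), perm


def cycle_back_indices(rgb_parts, ir_parts):
    """Hop RGB part -> most similar IR part -> most similar RGB part.

    Inputs are (num_parts, dim) feature matrices. A part is cycle-consistent
    when the returned index equals its own index. Hard argmax is used here
    for clarity; a trainable version would need a differentiable relaxation.
    """
    def l2_normalize(x):
        return x / np.linalg.norm(x, axis=1, keepdims=True)

    sim = l2_normalize(rgb_parts) @ l2_normalize(ir_parts).T  # cosine similarity
    forward = sim.argmax(axis=1)   # best IR part for each RGB part
    backward = sim.argmax(axis=0)  # best RGB part for each IR part
    return backward[forward]
```

In this toy setting, applying `shuffle_stripes` again with `np.argsort(perm)` undoes the shuffle, which is one way to read "permutation recovery"; the cycle indices could serve as positive-pair selectors for a contrastive loss, though how MMGL actually does this is not stated in the abstract.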
Pages: 3044-3057
Number of pages: 14