Bi-Directional Dynamic Interaction Network for Cross-Modality Person Re-Identification

Cited by: 0
Authors
Zheng A. [1 ,2 ,3 ]
Feng M. [1 ,3 ]
Li C. [1 ,2 ,3 ]
Tang J. [1 ,3 ]
Luo B. [1 ,3 ]
Affiliations
[1] Anhui Provincial Key Laboratory of Multimodal Cognitive Computation, Hefei
[2] School of Artificial Intelligence, Anhui University, Hefei
[3] School of Computer Science and Technology, Anhui University, Hefei
Keywords
convolutional neural network; cross-layer multi-resolution; cross-modality; dynamic convolution; person re-identification;
DOI
10.3724/SP.J.1089.2023.19280
Abstract
Current cross-modality person re-identification methods mainly use weight-sharing convolution kernels, which limits the model's ability to adapt dynamically to different inputs. Meanwhile, they rely mainly on high-level, coarse-resolution semantic features, which causes substantial information loss. Therefore, this paper proposes a bi-directional dynamic interaction network for cross-modality person re-identification. Firstly, a dual-stream network extracts the global features of the different modalities after each residual block. Secondly, according to the global content of each modality, the network dynamically generates customized convolution kernels to extract modality-specific features, and then integrates the complementary features transferred between modalities to alleviate heterogeneity. Finally, the features at different resolutions from each layer are fused to obtain a more discriminative and robust feature representation. Experimental results on two benchmark RGB-infrared person re-identification datasets, SYSU-MM01 and RegDB, demonstrate the effectiveness of the proposed method, which outperforms state-of-the-art methods by 4.70% and 2.12% in Rank-1 accuracy and by 4.30% and 2.67% in mAP, respectively. © 2023 Institute of Computing Technology. All rights reserved.
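The kernel-generation step described in the abstract — producing a customized convolution kernel from an input's global content rather than sharing one fixed kernel across modalities — can be illustrated with a minimal NumPy sketch. All shapes and the simple linear generator here are hypothetical; this is not the paper's actual network, only the general dynamic-convolution idea it builds on:

```python
import numpy as np

def global_pool(x):
    # x: (C, H, W) -> (C,) global average descriptor of the input content
    return x.mean(axis=(1, 2))

def dynamic_conv1x1(x, W_gen, b_gen):
    """Content-conditioned 1x1 convolution (a toy stand-in for dynamic
    kernel generation).

    x     : (C_in, H, W) feature map from one modality branch
    W_gen : (C_out * C_in, C_in) generator mapping the global descriptor
            to a customized, input-specific 1x1 kernel
    b_gen : (C_out * C_in,) generator bias
    """
    c_in, h, w = x.shape
    g = global_pool(x)                    # (C_in,) global content of this input
    kernel = W_gen @ g + b_gen            # flattened customized kernel
    c_out = kernel.size // c_in
    kernel = kernel.reshape(c_out, c_in)  # (C_out, C_in) 1x1 kernel
    # a 1x1 convolution is a matrix product over spatial positions
    y = kernel @ x.reshape(c_in, h * w)
    return y.reshape(c_out, h, w)

rng = np.random.default_rng(0)
x_rgb = rng.standard_normal((8, 4, 4))       # toy RGB-branch feature map
x_ir = rng.standard_normal((8, 4, 4))        # toy infrared-branch feature map
W_gen = rng.standard_normal((16 * 8, 8)) * 0.1  # shared kernel generator
b_gen = np.zeros(16 * 8)

y_rgb = dynamic_conv1x1(x_rgb, W_gen, b_gen)
y_ir = dynamic_conv1x1(x_ir, W_gen, b_gen)
```

Although the generator's parameters are shared, the effective kernel applied to `x_rgb` differs from the one applied to `x_ir`, since each is conditioned on its own input's global content — which is what gives a dynamic convolution its per-input adjustment ability.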
Pages: 371-382 (11 pages)