Exploiting Multi-View Part-Wise Correlation via an Efficient Transformer for Vehicle Re-Identification

Cited by: 18
Authors
Li, Ming [1]
Liu, Jun [2]
Zheng, Ce [3]
Huang, Xinming [4]
Zhang, Ziming [4]
Affiliations
[1] Natl Univ Singapore, Inst Data Sci, Singapore 119077, Singapore
[2] Singapore Univ Technol & Design, Informat Syst Technol & Design, Singapore 487372, Singapore
[3] Univ Cent Florida, Dept Comp Sci, Orlando, FL 32816 USA
[4] Worcester Polytech Inst, Dept Elect & Comp Engn, Worcester, MA 01609 USA
Keywords
Transformers; Correlation; Feature extraction; Visualization; Training; Benchmark testing; Task analysis; Correlation exploiting; multi-view learning; transformer; vehicle re-identification
DOI
10.1109/TMM.2021.3134839
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Image-based vehicle re-identification (ReID) has witnessed much progress in recent years. However, most existing works struggle to extract robust yet discriminative features from a single image to represent one vehicle instance. We argue that images taken from distinct viewpoints, e.g., front and back, have significantly different appearances and patterns for recognition. To identify each vehicle, these models have to capture consistent "ID codes" from totally different views, which makes learning difficult. Additionally, we claim that part-level correspondences among views, i.e., various vehicle parts observed in the same image and the same part visible from different viewpoints, also contribute to instance-level feature learning. Motivated by these observations, we propose to extract comprehensive vehicle instance representations from multiple views by modelling part-wise correlations. To this end, we present an efficient transformer-based framework that exploits both inner- and inter-view correlations for vehicle ReID. Specifically, we first adopt a convnet encoder to condense each view into a series of patch embeddings. Then our efficient transformer, which includes a distillation token and a noise token in addition to a regular classification token, is constructed to make these patch embeddings interact with each other regardless of whether they come from the same view or from different views. We conduct extensive experiments on widely used vehicle ReID benchmarks, and our approach achieves state-of-the-art performance, showing the effectiveness of our method.
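The core idea in the abstract, letting patch embeddings from all views attend to one another alongside a few extra tokens, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: it uses plain single-head full self-attention rather than their efficient transformer, and the function name `joint_view_attention`, the random weights, the token ordering, and all dimensions are assumptions made only for illustration.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def joint_view_attention(view_patches, extra_tokens, w_q, w_k, w_v):
    """Single-head self-attention over tokens pooled from all views.

    view_patches: list of arrays, each (num_patches, dim), one per view
    extra_tokens: (num_extra, dim); here standing in for the
                  classification, distillation, and noise tokens
    """
    # Concatenating every view into one sequence is what allows
    # both within-view and cross-view patch interactions.
    tokens = np.concatenate([extra_tokens] + list(view_patches), axis=0)
    q, k, v = tokens @ w_q, tokens @ w_k, tokens @ w_v
    scores = q @ k.T / np.sqrt(k.shape[-1])
    attn = softmax(scores, axis=-1)
    return attn @ v, attn

# Hypothetical sizes: 2 views, 4 patch embeddings each, 16-dim features.
rng = np.random.default_rng(0)
dim, patches_per_view, num_views = 16, 4, 2
views = [rng.normal(size=(patches_per_view, dim)) for _ in range(num_views)]
extra = rng.normal(size=(3, dim))
w_q, w_k, w_v = (rng.normal(size=(dim, dim)) / np.sqrt(dim) for _ in range(3))

out, attn = joint_view_attention(views, extra, w_q, w_k, w_v)
# Assumption: the first token plays the role of the classification token,
# so its output row is read as the instance-level feature.
instance_feature = out[0]
```

In this sketch each row of `attn` mixes patches from both views, so a front-view patch can draw on a back-view patch of the same vehicle; the paper's efficiency mechanism and training losses are not modelled here.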
Pages: 919-929
Page count: 11
Related papers
50 in total
  • [1] MULTI-VIEW LEARNING FOR VEHICLE RE-IDENTIFICATION
    Lin, Weipeng
    Li, Yidong
    Yang, Xiaoliang
    Peng, Peixi
    Xing, Junliang
    2019 IEEE INTERNATIONAL CONFERENCE ON MULTIMEDIA AND EXPO (ICME), 2019, : 832 - 837
  • [2] Beyond the Parts: Learning Multi-view Cross-part Correlation for Vehicle Re-identification
    Liu, Xinchen
    Liu, Wu
    Zheng, Jinkai
    Yan, Chenggang
    Mei, Tao
    MM '20: PROCEEDINGS OF THE 28TH ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, 2020, : 907 - 915
  • [3] MULTI-VIEW VEHICLE IMAGE GENERATION NETWORK FOR VEHICLE RE-IDENTIFICATION
    Xun, Yizhe
    Liu, Jia
    Islam, Sardar M. N.
    Chen, Yuanfang
    2024 IEEE INTERNATIONAL CONFERENCE ON COMMUNICATIONS WORKSHOPS, ICC WORKSHOPS 2024, 2024, : 517 - 522
  • [4] Vehicle Re-Identification by Deep Hidden Multi-View Inference
    Zhou, Yi
    Liu, Li
    Shao, Ling
    IEEE TRANSACTIONS ON IMAGE PROCESSING, 2018, 27 (07) : 3275 - 3287
  • [5] Multi-View Spatial Attention Embedding for Vehicle Re-Identification
    Teng, Shangzhi
    Zhang, Shiliang
    Huang, Qingming
    Sebe, Nicu
    IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2021, 31 (02) : 816 - 827
  • [6] Large margin metric learning for multi-view vehicle re-identification
    Zhang, Shilin
    Lin, Cong
    Ma, Siming
    Neurocomputing, 2021, 447 : 118 - 128
  • [7] Viewpoint-aware Attentive Multi-view Inference for Vehicle Re-identification
    Zhou, Yi
    Shao, Ling
    2018 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2018, : CP99 - CP99
  • [8] Multi-view feature fusion for person re-identification
    Xu, Yinsong
    Jiang, Zhuqing
    Men, Aidong
    Wang, Haiying
    Luo, Haiyong
    KNOWLEDGE-BASED SYSTEMS, 2021, 229
  • [9] MULTI-VIEW IMPLICIT TRANSFER FOR PERSON RE-IDENTIFICATION
    Xu, Wei
    Li, Yijun
    Gong, Chen
    Yang, Lie
    2015 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING (ICASSP), 2015, : 1151 - 1155