Cross-Modal Feature Fusion-Based Knowledge Transfer for Text-Based Person Search

Cited by: 1
Authors
You, Kaiyang [1, 2]
Chen, Wenjing [3]
Wang, Chengji [1, 2]
Sun, Hao [1, 2]
Xie, Wei [1, 2]
Affiliations
[1] Cent China Normal Univ, Sch Comp Sci, Hubei Prov Key Lab Artificial Intelligence & Smart, Wuhan 430079, Peoples R China
[2] Cent China Normal Univ, Natl Language Resources Monitoring & Res Ctr Netwo, Wuhan 430079, Peoples R China
[3] Hubei Univ Technol, Sch Comp Sci, Wuhan 430068, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Feature extraction; Knowledge transfer; Visualization; Transformers; Data mining; Task analysis; Sun; Text-based person search; knowledge imbalance; knowledge transfer; cross-modal fusion; TRANSFORMER;
DOI
10.1109/LSP.2024.3449222
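Chinese Library Classification (CLC) number
TM [Electrical Engineering]; TN [Electronic Technology, Communication Technology];
Discipline classification code
0808; 0809;
Abstract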
Text-based person search aims to retrieve images of a person from a large gallery based on a text description. Existing methods strive to bridge the modality gap between images and texts and have made promising progress. However, these approaches disregard the knowledge imbalance between images and texts caused by reporting bias. To resolve this issue, we present a cross-modal feature fusion-based knowledge transfer network to balance identity information between images and texts. First, we design an identity information emphasis module to enhance person-relevant information and suppress person-irrelevant information. Second, we design an intermediate modal-guided knowledge transfer module to balance the knowledge between images and texts. Experimental results on the CUHK-PEDES, ICFG-PEDES, and RSTPReid datasets demonstrate that our method achieves state-of-the-art performance.
Pages: 2230-2234
Page count: 5
Related papers
50 in total
  • [21] Full-view salient feature mining and alignment for text-based person search
    Xie, Sheng
    Zhang, Canlong
    Ning, Enhao
    Li, Zhixin
    Wang, Zhiwen
    Wei, Chunrong
    EXPERT SYSTEMS WITH APPLICATIONS, 2024, 251
  • [22] An Empirical Study of CLIP for Text-Based Person Search
    Cao, Min
    Bai, Yang
    Zeng, Ziyin
    Ye, Mang
    Zhang, Min
    THIRTY-EIGHTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, VOL 38 NO 1, 2024, : 465 - 473
  • [23] ITF-WPI: Image and text based cross-modal feature fusion model for wolfberry pest recognition
    Dai, Guowei
    Fan, Jingchao
    Dewi, Christine
    COMPUTERS AND ELECTRONICS IN AGRICULTURE, 2023, 212
  • [24] Cross-modal domain adaptation for text-based regularization of image semantics in image retrieval systems
    Pereira, Jose Costa
    Vasconcelos, Nuno
    COMPUTER VISION AND IMAGE UNDERSTANDING, 2014, 124 : 123 - 135
  • [25] Cross-modal retrieval based on multi-dimensional feature fusion hashing
    Ren, Dongxiao
    Xu, Weihua
    FRONTIERS IN PHYSICS, 2024, 12
  • [26] Multi-level Part-aware Feature Disentangling for Text-based Person Search
    Chen, Yuhao
    Zhang, Guoqing
    Zhang, Hongwei
    Zheng, Yuhui
    Lin, Weisi
    2023 IEEE INTERNATIONAL CONFERENCE ON MULTIMEDIA AND EXPO, ICME, 2023, : 2801 - 2806
  • [27] Cross-modal Fusion-based Prior Correction for Road Detection in Off-road Environments
    Wang, Yuru
    Sun, Yi
    Li, Jian
    Shi, Meiping
    2022 IEEE/RSJ INTERNATIONAL CONFERENCE ON INTELLIGENT ROBOTS AND SYSTEMS (IROS), 2022, : 12239 - 12246
  • [28] A Baseline Investigation: Transformer-based Cross-view Baseline for Text-based Person Search
    Zang, Xianghao
    Gao, Wei
    Li, Ge
    Fang, Han
    Ban, Chao
    He, Zhongjiang
    Sun, Hao
    PROCEEDINGS OF THE 31ST ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, MM 2023, 2023, : 7737 - 7746
  • [29] Improving Cross-Modal Constraints: Text Attribute Person Search With Graph Attention Networks
    Yang, Xi
    Wang, Xiaoqi
    Yang, Dong
    IEEE TRANSACTIONS ON MULTIMEDIA, 2024, 26 : 2493 - 2503
  • [30] Local-enhanced representation for text-based person search
    Zhang, Guoqing
    Chen, Yuhao
    Zheng, Yuhui
    Martin, Gaven
    Wang, Ruili
    PATTERN RECOGNITION, 2025, 161