Cross-Modal Feature Fusion-Based Knowledge Transfer for Text-Based Person Search

Cited by: 1

Authors:

You, Kaiyang [1,2]

Chen, Wenjing [3]

Wang, Chengji [1,2]

Sun, Hao [1,2]

Xie, Wei [1,2]

Affiliations:

[1] Cent China Normal Univ, Sch Comp Sci, Hubei Prov Key Lab Artificial Intelligence & Smart, Wuhan 430079, Peoples R China

[2] Cent China Normal Univ, Natl Language Resources Monitoring & Res Ctr Netwo, Wuhan 430079, Peoples R China

[3] Hubei Univ Technol, Sch Comp Sci, Wuhan 430068, Peoples R China

Source:

IEEE SIGNAL PROCESSING LETTERS | 2024 / Vol. 31

Funding:

National Natural Science Foundation of China;

Keywords:

Feature extraction; Knowledge transfer; Visualization; Transformers; Data mining; Task analysis; Sun; Text-based person search; knowledge imbalance; knowledge transfer; cross-modal fusion; TRANSFORMER;

DOI:

10.1109/LSP.2024.3449222

Chinese Library Classification (CLC) number:

TM [Electrical engineering]; TN [Electronic technology, communication technology];

Discipline classification code:

0808 ; 0809 ;

Abstract:

Text-based person search aims to retrieve images of a target person from a large gallery based on text descriptions. Existing methods strive to bridge the modality gap between images and texts and have made promising progress. However, these approaches disregard the knowledge imbalance between images and texts caused by reporting bias. To resolve this issue, we present a cross-modal feature fusion-based knowledge transfer network to balance identity information between images and texts. First, we design an identity information emphasis module to enhance person-relevant information and suppress person-irrelevant information. Second, we design an intermediate modal-guided knowledge transfer module to balance the knowledge between images and texts. Experimental results on the CUHK-PEDES, ICFG-PEDES, and RSTPReid datasets demonstrate that our method achieves state-of-the-art performance.

Pages: 2230-2234

Page count: 5

Related records: 50 in total

[21] Full-view salient feature mining and alignment for text-based person search
Xie, Sheng
Zhang, Canlong
Ning, Enhao
Li, Zhixin
Wang, Zhiwen
Wei, Chunrong
EXPERT SYSTEMS WITH APPLICATIONS, 2024, 251
[22] An Empirical Study of CLIP for Text-Based Person Search
Cao, Min
Bai, Yang
Zeng, Ziyin
Ye, Mang
Zhang, Min
THIRTY-EIGHTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, VOL 38 NO 1, 2024, : 465 - 473
[23] ITF-WPI: Image and text based cross-modal feature fusion model for wolfberry pest recognition
Dai, Guowei
Fan, Jingchao
Dewi, Christine
COMPUTERS AND ELECTRONICS IN AGRICULTURE, 2023, 212
[24] Cross-modal domain adaptation for text-based regularization of image semantics in image retrieval systems
Pereira, Jose Costa
Vasconcelos, Nuno
COMPUTER VISION AND IMAGE UNDERSTANDING, 2014, 124 : 123 - 135
[25] Cross-modal retrieval based on multi-dimensional feature fusion hashing
Ren, Dongxiao
Xu, Weihua
FRONTIERS IN PHYSICS, 2024, 12
[26] Multi-level Part-aware Feature Disentangling for Text-based Person Search
Chen, Yuhao
Zhang, Guoqing
Zhang, Hongwei
Zheng, Yuhui
Lin, Weisi
2023 IEEE INTERNATIONAL CONFERENCE ON MULTIMEDIA AND EXPO, ICME, 2023, : 2801 - 2806
[27] Cross-modal Fusion-based Prior Correction for Road Detection in Off-road Environments
Wang, Yuru
Sun, Yi
Li, Jian
Shi, Meiping
2022 IEEE/RSJ INTERNATIONAL CONFERENCE ON INTELLIGENT ROBOTS AND SYSTEMS (IROS), 2022, : 12239 - 12246
[28] A Baseline Investigation: Transformer-based Cross-view Baseline for Text-based Person Search
Zang, Xianghao
Gao, Wei
Li, Ge
Fang, Han
Ban, Chao
He, Zhongjiang
Sun, Hao
PROCEEDINGS OF THE 31ST ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, MM 2023, 2023, : 7737 - 7746
[29] Improving Cross-Modal Constraints: Text Attribute Person Search With Graph Attention Networks
Yang, Xi
Wang, Xiaoqi
Yang, Dong
IEEE TRANSACTIONS ON MULTIMEDIA, 2024, 26 : 2493 - 2503
[30] Local-enhanced representation for text-based person search
Zhang, Guoqing
Chen, Yuhao
Zheng, Yuhui
Martin, Gaven
Wang, Ruili
PATTERN RECOGNITION, 2025, 161
