Learning Semantic Polymorphic Mapping for Text-Based Person Retrieval

Cited by: 1
Authors
Li, Jiayi [1]
Jiang, Min [1]
Kong, Jun [2]
Tao, Xuefeng [2]
Luo, Xi [1]
Affiliations
[1] Jiangnan Univ, Engn Res Ctr Intelligent Technol Healthcare, Minist Educ, Wuxi 214122, Peoples R China
[2] Jiangnan Univ, Key Lab Adv Proc Control Light Ind, Minist Educ, Wuxi 214122, Peoples R China
Funding
National Natural Science Foundation of China; China Postdoctoral Science Foundation
Keywords
Semantics; Image reconstruction; Feature extraction; Task analysis; Cognition; Measurement; Representation learning; Text-based person retrieval; semantic polymorphism; implicit reasoning; modality alignment
DOI
10.1109/TMM.2024.3410129
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Text-Based Person Retrieval (TBPR) aims to identify a particular individual within an extensive image gallery using text as the query. The principal challenge inherent in the TBPR task revolves around how to map cross-modal information to a potential common space and learn a generic representation. Previous methods have primarily focused on aligning singular text-image pairs, disregarding the inherent polymorphism within both images and natural language expressions for the same individual. Moreover, these methods have also ignored the impact of semantic polymorphism-based intra-modal data distribution on cross-modal matching. Recent methods employ cross-modal implicit information reconstruction to enhance inter-modal connections. However, the process of information reconstruction remains ambiguous. To address these issues, we propose the Learning Semantic Polymorphic Mapping (LSPM) framework, facilitated by the prowess of pre-trained cross-modal models. Firstly, to learn cross-modal information representations with better robustness, we design the Inter-modal Information Aggregation (Inter-IA) module to achieve cross-modal polymorphic mapping, fortifying the foundation of our information representations. Secondly, to attain a more concentrated intra-modal information representation based on semantic polymorphism, we design the Intra-modal Information Aggregation (Intra-IA) module to further constrain the embeddings. Thirdly, to further explore the potential of cross-modal interactions within the model, we design the implicit reasoning module, Masked Information Guided Reconstruction (MIGR), with constraint guidance to elevate overall performance. Extensive experiments on both the CUHK-PEDES and ICFG-PEDES datasets show that we achieve state-of-the-art results on Rank-1, mAP, and mINP compared to existing methods.
Pages: 10678 - 10691
Page count: 14
Related papers
50 in total
  • [31] Improving Text-Based Person Retrieval by Excavating All-Round Information Beyond Color
    Zhu, Aichun
    Wang, Zijie
    Xue, Jingyi
    Wan, Xili
    Jin, Jing
    Wang, Tian
    Snoussi, Hichem
    IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2024, : 1 - 15
  • [32] Improving embedding learning by virtual attribute decoupling for text-based person search
    Chengji Wang
    Zhiming Luo
    Yaojin Lin
    Shaozi Li
    Neural Computing and Applications, 2022, 34 : 5625 - 5647
  • [33] Text-based experiment retrieval in genomic databases
    Sener, Duygu Dede
    Ogul, Hasan
    Basak, Selen
    JOURNAL OF INFORMATION SCIENCE, 2024, 50 (05) : 1334 - 1344
  • [34] EFFECTS OF CENTRALITY ON RETRIEVAL OF TEXT-BASED CONCEPTS
    ALBRECHT, JE
    OBRIEN, EJ
    JOURNAL OF EXPERIMENTAL PSYCHOLOGY-LEARNING MEMORY AND COGNITION, 1991, 17 (05) : 932 - 939
  • [35] Multi-granularity Separation Network for Text-Based Person Retrieval with Bidirectional Refinement Regularization
    Li, Shenshen
    Xu, Xing
    Shen, Fumin
    Yang, Yang
    PROCEEDINGS OF THE 2023 ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA RETRIEVAL, ICMR 2023, 2023, : 307 - 315
  • [36] Fine-grained semantic oriented embedding set alignment for text-based person search
    Zhao, Jiaqi
    Fu, Ao
    Zhou, Yong
    Du, Wen-liang
    Yao, Rui
    IMAGE AND VISION COMPUTING, 2024, 152
  • [37] Text-based Person Search in Full Images via Semantic-Driven Proposal Generation
    Zhang, Shizhou
    Cheng, De
    Luo, Wenlong
    Xing, Yinghui
    Long, Duo
    Li, Hao
    Niu, Kai
    Liang, Guoqiang
    Zhang, Yanning
    PROCEEDINGS OF THE 4TH INTERNATIONAL WORKSHOP ON HUMAN-CENTRIC MULTIMEDIA ANALYSIS, HCMA 2023, 2023, : 5 - 14
  • [38] VGSG: Vision-Guided Semantic-Group Network for Text-Based Person Search
    He, Shuting
    Luo, Hao
    Jiang, Wei
    Jiang, Xudong
    Ding, Henghui
    IEEE TRANSACTIONS ON IMAGE PROCESSING, 2024, 33 : 163 - 176
  • [39] Look Before You Leap: Improving Text-based Person Retrieval by Learning A Consistent Cross-modal Common Manifold
    Wang, Zijie
    Zhu, Aichun
    Xue, Jingyi
    Wan, Xili
    Liu, Chao
    Wang, Tian
    Li, Yifeng
    PROCEEDINGS OF THE 30TH ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, MM 2022, 2022, : 1984 - 1992
  • [40] A Scene Text-Based Image Retrieval System
    Thuy Ho
    Ngoc Ly
    2012 IEEE INTERNATIONAL SYMPOSIUM ON SIGNAL PROCESSING AND INFORMATION TECHNOLOGY (ISSPIT), 2012, : 79 - 84