Multilateral Semantic Relations Modeling for Image Text Retrieval

Cited by: 14
Authors
Wang, Zheng [1, 3]
Gao, Zhenwei [1]
Guo, Kangshuai [1]
Yang, Yang [1]
Wang, Xiaoming [1]
Shen, Heng Tao [1, 2]
Affiliations
[1] Univ Elect Sci & Technol China, Chengdu, Peoples R China
[2] Peng Cheng Lab, Shenzhen, Peoples R China
[3] UESTC Guangdong, Inst Elect & Informat Engn, Chengdu, Peoples R China
Funding
National Natural Science Foundation of China
DOI
10.1109/CVPR52729.2023.00277
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Image-text retrieval is a fundamental task that bridges vision and language by exploiting various strategies for fine-grained alignment between regions and words. The task remains difficult mainly because of one-to-many correspondence, where an arbitrary query can match a set of instances from the other modality. While existing solutions to this problem, including multi-point mapping, probabilistic distribution, and geometric embedding, have made promising progress, one-to-many correspondence is still under-explored. In this work, we develop Multilateral Semantic Relations Modeling (termed MSRM) for image-text retrieval, which captures the one-to-many correspondence between multiple samples and a given query via hypergraph modeling. Specifically, a given query is first mapped to a probabilistic embedding to learn its true semantic distribution based on Mahalanobis distance. Then each candidate instance in a mini-batch is regarded as a hypergraph node with its mean semantics, while a Gaussian query is modeled as a hyperedge to capture the semantic correlations beyond the pair between candidate points and the query. Comprehensive experimental results on two widely used datasets demonstrate that our MSRM method can outperform state-of-the-art methods in handling multiple matches while still maintaining comparable performance on instance-level matching.
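The abstract mentions a Gaussian query, Mahalanobis distance, and a hyperedge over candidate nodes. A minimal sketch of how those pieces could fit together is given here; it is not the authors' implementation. The diagonal covariance, the fixed distance threshold for hyperedge membership, and the names `mahalanobis_diag` and `hyperedge_members` are all assumptions made for illustration.

```python
import math

def mahalanobis_diag(point, mean, var):
    """Mahalanobis distance from a point to a Gaussian with diagonal covariance.

    Assumption: the covariance is diagonal, so `var` holds one variance per dimension.
    """
    return math.sqrt(sum((p - m) ** 2 / v for p, m, v in zip(point, mean, var)))

def hyperedge_members(candidates, mean, var, threshold):
    """Mark which candidate nodes fall inside the query's hyperedge.

    Assumption: membership is a hard threshold on the distance; the paper
    may define the hyperedge differently.
    """
    return [mahalanobis_diag(c, mean, var) <= threshold for c in candidates]

# Toy data: three candidate mean embeddings and one Gaussian query (hypothetical values).
candidates = [(0.0, 0.0), (1.0, 2.0), (4.0, 0.0)]
query_mean = (0.0, 0.0)
query_var = (1.0, 4.0)

distances = [mahalanobis_diag(c, query_mean, query_var) for c in candidates]
members = hyperedge_members(candidates, query_mean, query_var, threshold=2.0)
```

In this toy setup a single query can cover several candidates at once, which is the one-to-many situation the abstract describes.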
Pages: 2830-2839
Page count: 10