A Learning to Rank framework applied to text-image retrieval

Cited by: 0
Authors
David Buffoni
Sabrina Tollari
Patrick Gallinari
Affiliation
[1] Université Pierre et Marie CURIE - Paris 6 / LIP6
Keywords
Learning to Rank; Text-image retrieval; OWPC; Visuo-textual fusion; Pooling for Learning to Rank
DOI
Not available
Abstract
We present a framework based on a Learning to Rank setting for a text-image retrieval task. In Information Retrieval, the goal is to compute the similarity between a document and a user query. In text-image retrieval, where several similarities exist, human intervention is often needed to decide how to combine them. With the Learning to Rank approach, by contrast, the similarities are combined automatically. Learning to Rank is a paradigm in which a learned scoring function produces a ranked list of images for a given user query. These scoring functions are generally a combination of similarities between a document and a query. Learning to Rank algorithms have previously been applied successfully to text retrieval, where they outperformed baselines such as BM25 or TF-IDF. This inspired us to apply our state-of-the-art algorithm, called OWPC (Usunier et al. 2009), to the text-image retrieval task. No benchmark is currently available for this task, so we present a framework for building one. We validate the algorithm empirically on the constructed dataset by comparing it with typical text-image retrieval similarities. In both settings, visual only and combined text and visual, our algorithm performs better than a simple baseline.
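The idea of learning how to combine several similarities, rather than tuning the combination by hand, can be illustrated with a minimal sketch. It assumes a linear scoring function over two similarity values per image (one textual, one visual) and uses a plain pairwise hinge loss as a stand-in; it is not OWPC, and the function names and toy data are hypothetical, made up for illustration only.

```python
import random


def score(weights, sims):
    """Linear fusion: weighted sum of the similarity values for one image."""
    return sum(w * s for w, s in zip(weights, sims))


def train_pairwise(queries, n_features, epochs=300, lr=0.1, seed=0):
    """Learn fusion weights with a pairwise hinge loss (a stand-in, not OWPC).

    `queries` is a list of queries; each query is a list of
    (similarity_vector, is_relevant) tuples, one per candidate image.
    """
    rng = random.Random(seed)
    weights = [0.0] * n_features
    for _ in range(epochs):
        for candidates in queries:
            relevant = [x for x, y in candidates if y]
            irrelevant = [x for x, y in candidates if not y]
            pairs = [(p, n) for p in relevant for n in irrelevant]
            rng.shuffle(pairs)
            for pos, neg in pairs:
                # Update only when the relevant image fails to outscore the
                # irrelevant one by a margin of 1 (hinge loss is non-zero).
                if score(weights, pos) - score(weights, neg) < 1.0:
                    weights = [w + lr * (p - n)
                               for w, p, n in zip(weights, pos, neg)]
    return weights


def rank(weights, candidates):
    """Return candidate indices sorted by decreasing fused score."""
    return sorted(range(len(candidates)),
                  key=lambda i: -score(weights, candidates[i]))


# Toy data (made up): each vector is (text similarity, visual similarity).
toy_queries = [
    [((0.9, 0.2), True), ((0.8, 0.7), True),
     ((0.3, 0.6), False), ((0.1, 0.1), False)],
    [((0.7, 0.4), True), ((0.2, 0.9), False), ((0.4, 0.3), False)],
]

weights = train_pairwise(toy_queries, n_features=2)
ranking = rank(weights, [sims for sims, _ in toy_queries[0]])
print("learned weights:", weights)
print("ranking for first query:", ranking)
```

In this sketch the relative weight given to the textual and visual similarities is learned from relevance labels, which is the role that manual tuning would otherwise play.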
Pages: 161-180
Number of pages: 19