A Learning to Rank framework applied to text-image retrieval

Cited by: 0

Authors:

David Buffoni

Sabrina Tollari

Patrick Gallinari

Affiliation:

[1] Université Pierre et Marie CURIE - Paris 6 / LIP6,

Source:

Multimedia Tools and Applications | 2012 / Vol. 60

Keywords:

Learning to Rank; Text-image retrieval; OWPC; Visuo-textual fusion; Pooling for Learning to Rank

DOI:

Not available

Abstract:

We present a framework based on a Learning to Rank setting for a text-image retrieval task. In Information Retrieval, the goal is to compute the similarity between a document and a user query. In text-image retrieval, where several similarities exist, human intervention is often needed to decide how to combine them. With the Learning to Rank approach, by contrast, the similarities are combined automatically. Learning to Rank is a paradigm in which a learnt score function produces a ranked list of images for a given user query. Such score functions are generally a combination of similarities between a document and a query. In the past, Learning to Rank algorithms were successfully applied to text retrieval, where they outperformed baselines such as BM25 or TF-IDF. This inspired us to apply our state-of-the-art algorithm, called OWPC (Usunier et al. 2009), to the text-image retrieval task. At this time, no benchmarks are available; therefore, we present a framework for building one. The algorithm is validated empirically on the constructed dataset through a comparison of typical text-image retrieval similarities. In both cases, visual only and combined text and visual, our algorithm performs better than a simple baseline.
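The idea of learning how to combine several query-image similarities, rather than tuning the combination by hand, can be illustrated with a minimal sketch. This is not OWPC and not the authors' implementation: it assumes a plain linear score function trained with a generic pairwise perceptron-style update, and the feature layout (text similarity, visual similarity), the function names, and all numbers are hypothetical.

```python
from typing import Dict, List, Sequence, Tuple

Features = Sequence[float]  # hypothetical layout: [text_similarity, visual_similarity]


def score(weights: Sequence[float], sims: Features) -> float:
    """Score an image as a weighted sum of its similarities to the query."""
    return sum(w * s for w, s in zip(weights, sims))


def rank(weights: Sequence[float], candidates: Dict[str, Features]) -> List[str]:
    """Return image ids sorted by decreasing score."""
    return sorted(candidates, key=lambda img: score(weights, candidates[img]), reverse=True)


def fit_pairwise(
    pairs: Sequence[Tuple[Features, Features]],
    n_features: int,
    epochs: int = 50,
    lr: float = 0.1,
    margin: float = 1.0,
) -> List[float]:
    """Learn weights so that each relevant image outscores its irrelevant partner.

    Generic pairwise update (an assumption for illustration, not OWPC):
    whenever a (relevant, irrelevant) pair violates the margin, move the
    weights toward the difference of their feature vectors.
    """
    w = [0.0] * n_features
    for _ in range(epochs):
        for relevant, irrelevant in pairs:
            if score(w, relevant) - score(w, irrelevant) < margin:
                w = [wi + lr * (r - i) for wi, r, i in zip(w, relevant, irrelevant)]
    return w


# Made-up training pairs: (features of a relevant image, features of an irrelevant one).
training_pairs = [
    ([0.9, 0.6], [0.4, 0.5]),
    ([0.7, 0.8], [0.6, 0.2]),
]
weights = fit_pairwise(training_pairs, n_features=2)

# Made-up candidate images for a new query.
candidates = {
    "img_a": [0.9, 0.7],
    "img_b": [0.8, 0.1],
    "img_c": [0.2, 0.3],
}
ranking = rank(weights, candidates)
```

In this sketch the weights play the role that manual tuning would otherwise play: they are set from relevance judgments, and the ranked list for a new query follows from sorting candidates by the learnt score.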

Pages: 161 - 180

Page count: 19
