A Learning to Rank framework applied to text-image retrieval

Cited by: 0

Authors:

David Buffoni

Sabrina Tollari

Patrick Gallinari

Affiliation:

[1] Université Pierre et Marie CURIE - Paris 6 / LIP6,

Source:

Multimedia Tools and Applications | 2012 / Vol. 60

Keywords:

Learning to Rank; Text-image retrieval; OWPC; Visuo-textual fusion; Pooling for Learning to Rank;

DOI:

Not available

Chinese Library Classification number:

Subject classification number:

Abstract:

We present a framework based on a Learning to Rank setting for a text-image retrieval task. In Information Retrieval, the goal is to compute the similarity between a document and a user query. In text-image retrieval, where several similarities exist, human intervention is often needed to decide how to combine them. With the Learning to Rank approach, by contrast, the similarities are combined automatically. Learning to Rank is a paradigm in which a learnt scoring function produces a ranked list of images for a given user query. Such scoring functions are generally a combination of similarities between a document and a query. In the past, Learning to Rank algorithms were successfully applied to text retrieval, where they outperformed baselines such as BM25 or TFIDF. This inspired us to apply our state-of-the-art algorithm, called OWPC (Usunier et al. 2009), to the text-image retrieval task. Because no benchmark is currently available, we present a framework for building one. We validate the algorithm empirically on the resulting dataset by comparison with typical text-image retrieval similarities. In both settings, visual only and combined text and visual, our algorithm performs better than a simple baseline.
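The idea of learning how to combine several query-image similarities, rather than tuning the combination by hand, can be illustrated with a minimal sketch. This is not OWPC and not the authors' implementation: it is a plain pairwise hinge-loss learner over a linear scoring function, and the function names, the two features (a text similarity and a visual similarity), and the toy data are all invented for illustration.

```python
# Minimal sketch (NOT OWPC): learn weights that combine a text similarity
# and a visual similarity into one score, using a pairwise hinge loss.
# All names and data below are hypothetical.

def score(weights, sims):
    """Linear combination of the similarities of one image to the query."""
    return sum(w * s for w, s in zip(weights, sims))

def train_pairwise(queries, n_features, epochs=300, lr=0.1):
    """Margin-perceptron training on (relevant, non-relevant) image pairs.

    queries: list of queries; each query is a list of (sims, relevant)
    tuples, where sims is a list of similarity values for one image.
    """
    weights = [0.0] * n_features
    for _ in range(epochs):
        for docs in queries:
            relevant = [s for s, rel in docs if rel]
            others = [s for s, rel in docs if not rel]
            for p in relevant:
                for n in others:
                    # Update when the relevant image does not beat the
                    # non-relevant one by a margin of at least 1.
                    if score(weights, p) - score(weights, n) < 1.0:
                        weights = [w + lr * (pi - ni)
                                   for w, pi, ni in zip(weights, p, n)]
    return weights

def rank(weights, docs):
    """Return image indices sorted by decreasing learnt score."""
    return sorted(range(len(docs)),
                  key=lambda i: score(weights, docs[i][0]),
                  reverse=True)

# Toy data: each image is ([text_similarity, visual_similarity], relevant).
QUERIES = [
    [([0.9, 0.2], True), ([0.4, 0.8], True),
     ([0.5, 0.1], False), ([0.1, 0.4], False)],
    [([0.7, 0.6], True), ([0.6, 0.1], False), ([0.2, 0.5], False)],
]

if __name__ == "__main__":
    w = train_pairwise(QUERIES, n_features=2)
    print("learnt weights:", w)
    for docs in QUERIES:
        print("ranking:", rank(w, docs))
```

On this toy data neither similarity alone separates relevant from non-relevant images in the first query, while a learnt combination of the two does, which is the situation the abstract describes for visuo-textual fusion.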

Citation

Pages: 161 - 180

Page count: 19

50 entries in total

[21] A transfer learning based text-image feature mapping algorithm
Liu, Jie
Du, Jun-Ping
Beijing Youdian Daxue Xuebao/Journal of Beijing University of Posts and Telecommunications, 2012, 35 (06) : 1 - 5
[22] Transformer-Enhanced Visual-Semantic Representation for Text-Image Retrieval
Zhang, Meng
Wu, Wei
Zhang, Haotian
2022 34TH CHINESE CONTROL AND DECISION CONFERENCE, CCDC, 2022, : 2042 - 2048
[23] CAMP: Cross-Modal Adaptive Message Passing for Text-Image Retrieval
Wang, Zihao
Liu, Xihui
Li, Hongsheng
Sheng, Lu
Yan, Junjie
Wang, Xiaogang
Shao, Jing
2019 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2019), 2019, : 5763 - 5772
[24] Hypersphere-Based Remote Sensing Cross-Modal Text-Image Retrieval via Curriculum Learning
Zhang, Weihang
Li, Jihao
Li, Shuoke
Chen, Jialiang
Zhang, Wenkai
Gao, Xin
Sun, Xian
IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING, 2023, 61
[25] A Vision Enhanced Framework for Indonesian Multimodal Abstractive Text-Image Summarization
Song, Yutao
Lin, Nankai
Li, Lingbao
Jiang, Shengyi
PROCEEDINGS OF THE 2024 27TH INTERNATIONAL CONFERENCE ON COMPUTER SUPPORTED COOPERATIVE WORK IN DESIGN, CSCWD 2024, 2024, : 61 - 66
[26] Fisher Linear Discriminant Analysis for text-image combination in multimedia information retrieval
Moulin, Christophe
Largeron, Christine
Ducottet, Christophe
Gery, Mathias
Barat, Cecile
PATTERN RECOGNITION, 2014, 47 (01) : 260 - 269
[27] TEXT-IMAGE ARTICULATION IN THE PROCESSING OF AN EXPOSITORY TEXT
GAONACH, D
INTERNATIONAL JOURNAL OF PSYCHOLOGY, 1992, 27 (3-4) : 578 - 578
[28] INRNet: Neighborhood Re-Ranking-Based Method for Pedestrian Text-Image Retrieval
Wang, Kehao
Wang, Yuhui
Xue, Lian
Li, Qifeng
IEEE ACCESS, 2025, 13 : 1470 - 1480
[29] Uncertainty-aware coarse-to-fine alignment for text-image person retrieval
Yifei Deng
Zhengyu Chen
Chenglong Li
Jin Tang
Visual Intelligence, 2025, 3 (1):
[30] Leaner and Faster: Two-Stage Model Compression for Lightweight Text-Image Retrieval
Ren, Siyu
Zhu, Kenny Q.
NAACL 2022: THE 2022 CONFERENCE OF THE NORTH AMERICAN CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS: HUMAN LANGUAGE TECHNOLOGIES, 2022, : 4085 - 4090
