A Learning to Rank framework applied to text-image retrieval

Cited by: 0
Authors
David Buffoni
Sabrina Tollari
Patrick Gallinari
Affiliation
[1] Université Pierre et Marie Curie - Paris 6 / LIP6
Keywords
Learning to Rank; Text-image retrieval; OWPC; Visuo-textual fusion; Pooling for Learning to Rank
DOI
Not available
Abstract
We present a framework based on a Learning to Rank setting for a text-image retrieval task. In Information Retrieval, the goal is to compute the similarity between a document and a user query. In text-image retrieval, where several similarities exist, human intervention is often needed to decide how to combine them. With the Learning to Rank approach, by contrast, the similarities are combined automatically. Learning to Rank is a paradigm in which a learnt scoring function produces a ranked list of images for a given user query. Such scoring functions are generally a combination of similarities between a document and a query. In the past, Learning to Rank algorithms were successfully applied to text retrieval, where they outperformed baselines such as BM25 or TF-IDF. This inspired us to apply our state-of-the-art algorithm, called OWPC (Usunier et al. 2009), to the text-image retrieval task. At this time, no benchmarks are available, so we present a framework for building one. We validate the algorithm empirically on the constructed dataset by comparing it with typical text-image retrieval similarities. In both settings, visual only and combined text and visual, our algorithm performs better than a simple baseline.
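
The idea of learning how to combine several similarities, instead of fixing the combination by hand, can be illustrated with a minimal sketch. This is not OWPC: it is a plain pairwise margin perceptron over a linear scoring function, chosen only because it is short. The function names, the two features (a text similarity and a visual similarity), and all toy values are assumptions made up for illustration and do not come from the paper.

```python
from typing import List, Sequence, Tuple

# One candidate image for a query: (similarity vector, is_relevant).
# Hypothetical features: [text_similarity, visual_similarity].
Doc = Tuple[Sequence[float], bool]


def score(weights: Sequence[float], sims: Sequence[float]) -> float:
    """Linear scoring function: weighted sum of the similarities."""
    return sum(w * s for w, s in zip(weights, sims))


def train_pairwise(queries: List[List[Doc]], n_features: int,
                   epochs: int = 50, lr: float = 0.1,
                   margin: float = 1.0) -> List[float]:
    """Learn weights so relevant images outscore irrelevant ones.

    For every (relevant, irrelevant) pair of the same query whose score
    gap is below the margin, move the weights toward the relevant image.
    """
    weights = [0.0] * n_features
    for _ in range(epochs):
        for docs in queries:
            relevant = [s for s, rel in docs if rel]
            irrelevant = [s for s, rel in docs if not rel]
            for pos in relevant:
                for neg in irrelevant:
                    if score(weights, pos) - score(weights, neg) < margin:
                        for i in range(n_features):
                            weights[i] += lr * (pos[i] - neg[i])
    return weights


def rank(weights: Sequence[float], sims: List[Sequence[float]]) -> List[int]:
    """Return image indices sorted by decreasing score."""
    return sorted(range(len(sims)),
                  key=lambda i: score(weights, sims[i]), reverse=True)


# Toy training set (invented values). In each query the relevant image
# is first; neither similarity alone puts it at the top of the list.
TRAIN_QUERIES: List[List[Doc]] = [
    [((0.8, 0.6), True), ((0.9, 0.1), False), ((0.2, 0.7), False)],
    [((0.7, 0.7), True), ((0.8, 0.2), False), ((0.1, 0.8), False)],
]

if __name__ == "__main__":
    learnt = train_pairwise(TRAIN_QUERIES, n_features=2)
    candidates = [sims for sims, _ in TRAIN_QUERIES[0]]
    print("learnt weights:", learnt)
    print("text only     :", rank([1.0, 0.0], candidates))
    print("visual only   :", rank([0.0, 1.0], candidates))
    print("learnt fusion :", rank(learnt, candidates))
```

On this toy data the text-only and visual-only rankings each place an irrelevant image first, while the learnt combination places the relevant image first. A listwise or ordered-weighted loss such as the one OWPC uses would replace the pairwise update rule; the linear scoring function over similarities would stay the same.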
Pages: 161 - 180
Page count: 19