Query Performance Prediction for Neural IR: Are We There Yet?

Citations: 6
Authors
Faggioli, Guglielmo [1]
Formal, Thibault [2, 3]
Marchesin, Stefano [1]
Clinchant, Stephane [2]
Ferro, Nicola [1]
Piwowarski, Benjamin [3, 4]
Affiliations
[1] Univ Padua, Padua, Italy
[2] Naver Labs Europe, Meylan, France
[3] Sorbonne Univ, ISIR, Paris, France
[4] CNRS, Paris, France
Funding
EU Horizon 2020;
Keywords
DIVERGENCE;
DOI
10.1007/978-3-031-28244-7_15
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
Evaluation in Information Retrieval (IR) relies on post-hoc empirical procedures, which are time-consuming and expensive operations. To alleviate this, Query Performance Prediction (QPP) models have been developed to estimate the performance of a system without the need for human-made relevance judgements. Such models, usually relying on lexical features from queries and corpora, have been applied to traditional sparse IR methods, with various degrees of success. With the advent of neural IR and large Pre-trained Language Models, the retrieval paradigm has significantly shifted towards more semantic signals. In this work, we study and analyze to what extent current QPP models can predict the performance of such systems. Our experiments consider seven traditional bag-of-words and seven BERT-based IR approaches, as well as nineteen state-of-the-art QPPs evaluated on two collections, Deep Learning '19 and Robust '04. Our findings show that QPPs perform statistically significantly worse on neural IR systems. In settings where semantic signals are prominent (e.g., passage retrieval), their performance on neural models drops by as much as 10% compared to bag-of-words approaches. On top of that, in lexical-oriented scenarios, QPPs fail to predict performance for neural IR systems on those queries where they differ from traditional approaches the most.
Pages: 232-248
Number of pages: 17