Towards more effective techniques for automatic query expansion

Authors
Carpineto, C [1]
Romano, G [1]
Affiliation
[1] Fdn Ugo Bordoni, I-00142 Rome, Italy
Source
RESEARCH AND ADVANCED TECHNOLOGY FOR DIGITAL LIBRARIES, PROCEEDINGS | 1999 / Vol. 1696
DOI
Not available
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Techniques for automatic query expansion from top retrieved documents have recently shown promise for improving retrieval effectiveness on large collections, but there is still a lack of systematic evaluation and comparative studies. In this paper we focus on term-scoring methods based on the differences between the distribution of terms in (pseudo-)relevant documents and the distribution of terms in all documents, seen as a complement or an alternative to more conventional techniques. We show that when such distributional methods are used to select expansion terms within Rocchio's classical reweighting scheme, the overall performance is not likely to improve. However, we also show that when the same distributional methods are used to both select and weight expansion terms, retrieval effectiveness may improve considerably. We then argue, based on their variation in performance on individual queries, that the sets of ranked terms suggested by individual distributional methods can be combined to further improve mean performance, by analogy with ensembling classifiers, and we present experimental evidence supporting this view. Taken together, our experiments show that with automatic query expansion it is possible to achieve performance gains as high as 21.34% over the non-expanded query (for non-interpolated average precision). We also discuss the effect that the main parameters involved in automatic query expansion, such as query difficulty, number of selected documents, and number of selected terms, have on retrieval effectiveness.
Pages: 126-141
Number of pages: 16
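Illustrative sketch (not from the paper): the abstract describes scoring candidate expansion terms by how much their distribution in the (pseudo-)relevant documents differs from their distribution in the whole collection, and then using the same score to both select and weight the terms. The abstract does not say which distributional scores were used, so the Python below assumes one plausible choice, each term's contribution to the Kullback-Leibler divergence between the two distributions. The function names `kl_term_scores` and `expand_query`, the parameter `beta`, and the normalization by the top score are all hypothetical choices made for illustration.

```python
import math
from collections import Counter


def kl_term_scores(feedback_docs, collection_docs):
    """Score terms by their contribution to KL(feedback || collection).

    Each document is a list of tokens. Assumes the feedback documents are
    drawn from the collection; terms absent from the collection are skipped.
    """
    fb = Counter(t for doc in feedback_docs for t in doc)
    coll = Counter(t for doc in collection_docs for t in doc)
    fb_total = sum(fb.values())
    coll_total = sum(coll.values())
    scores = {}
    for term, count in fb.items():
        if coll[term] == 0:
            continue  # cannot estimate a collection probability for this term
        p_rel = count / fb_total
        p_coll = coll[term] / coll_total
        scores[term] = p_rel * math.log(p_rel / p_coll)
    return scores


def expand_query(query_weights, scores, n_terms=10, beta=0.5):
    """Add the n_terms best-scoring terms, weighted by their normalized score.

    The same distributional score both selects and weights the terms.
    beta (an assumed parameter) scales expansion weights against the
    original query weights.
    """
    top = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)[:n_terms]
    expanded = dict(query_weights)
    if not top or top[0][1] <= 0:
        return expanded  # no term is more typical of the feedback set
    max_score = top[0][1]
    for term, score in top:
        if score > 0:
            expanded[term] = expanded.get(term, 0.0) + beta * score / max_score
    return expanded
```

In this sketch the top retrieved documents would be passed as `feedback_docs`, and the returned dictionary of term weights would be used to run the expanded query. Combining several scoring methods, as the abstract suggests, would require merging the ranked term lists from each method, which is not shown here.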