Automatic Term Mismatch Diagnosis for Selective Query Expansion

Authors
Zhao, Le [1]
Callan, Jamie [1]
Affiliations
[1] Carnegie Mellon University, Language Technologies Institute, Pittsburgh, PA 15213, USA
Keywords
Query term diagnosis; term mismatch; term expansion; Boolean conjunctive normal form queries; simulated user interactions
DOI
Not available
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
People are seldom aware that their search queries frequently mismatch a majority of the relevant documents. This may not be a serious problem for topics with a large and diverse set of relevant documents, but it greatly increases the chance of search failure for less popular search needs. We aim to address the mismatch problem by developing accurate and simple queries that require minimal effort to construct. This is achieved by targeting retrieval interventions at the query terms that are likely to mismatch relevant documents. For a given topic, the proportion of relevant documents that do not contain a term measures the probability that the term mismatches relevant documents, i.e., the term mismatch probability. Recent research demonstrates that this probability can be estimated reliably prior to retrieval. Typically, it is used in probabilistic retrieval models to provide query-dependent term weights. This paper develops a new use: automatic diagnosis of term mismatch. A search engine can use the diagnosis to suggest manual query reformulation, guide interactive query expansion, guide automatic query expansion, or motivate other responses. The research described here uses the diagnosis to guide interactive query expansion and to create Boolean conjunctive normal form (CNF) structured queries that selectively expand 'problem' query terms while leaving the rest of the query untouched. Experiments with TREC Ad-hoc and Legal Track datasets demonstrate that, with high-quality manual expansion, this diagnostic approach can reduce user effort by 33% and produce simple and effective structured queries that surpass their bag-of-words counterparts.
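The two ideas in the abstract, the term mismatch probability and the selective CNF expansion of 'problem' terms, can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes the relevant documents are already known, so it computes the probability directly from its definition, whereas the paper predicts it before retrieval. The function names, the toy documents, and the expansion terms are all hypothetical.

```python
from typing import Dict, List, Set


def mismatch_probability(term: str, relevant_docs: List[Set[str]]) -> float:
    """Proportion of relevant documents that do NOT contain `term`."""
    if not relevant_docs:
        raise ValueError("need at least one relevant document")
    missing = sum(1 for doc in relevant_docs if term not in doc)
    return missing / len(relevant_docs)


def build_cnf_query(
    query_terms: List[str],
    mismatch: Dict[str, float],
    expansions: Dict[str, List[str]],
    num_to_expand: int,
) -> str:
    """Expand only the terms diagnosed as most likely to mismatch.

    Each query term becomes one conjunct; a diagnosed 'problem' term is
    OR-ed with its expansion terms, and all other terms are left untouched.
    """
    # Diagnosis step: rank terms by mismatch probability, highest first.
    ranked = sorted(query_terms, key=lambda t: mismatch[t], reverse=True)
    problem_terms = set(ranked[:num_to_expand])

    clauses = []
    for term in query_terms:
        alternatives = [term]
        if term in problem_terms:
            alternatives += expansions.get(term, [])
        clauses.append("(" + " OR ".join(alternatives) + ")")
    return " AND ".join(clauses)


# Toy data (hypothetical): each relevant document is a set of its terms.
relevant_docs = [
    {"prognosis", "cancer", "treatment"},
    {"outlook", "cancer"},
    {"survival", "cancer", "rate"},
    {"prognosis", "tumor"},
]
query_terms = ["prognosis", "cancer"]
mismatch = {t: mismatch_probability(t, relevant_docs) for t in query_terms}

# Hypothetical manual expansion terms, as a user might supply them.
expansions = {"prognosis": ["outlook", "survival"], "cancer": ["tumor"]}

cnf_query = build_cnf_query(query_terms, mismatch, expansions, num_to_expand=1)
print(mismatch)
print(cnf_query)
```

In this toy example "prognosis" is absent from two of the four relevant documents and "cancer" from one, so with a budget of one expanded term only "prognosis" receives its alternatives and "cancer" stays as a single-term conjunct.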
Pages: 515-524
Number of pages: 10