Construction of query concepts based on feature clustering of documents

Authors
Youjin Chang
Minkoo Kim
Vijay V. Raghavan
Affiliations
[1] Ajou University, Graduate School of Information and Communication
[2] Ajou University, Department of Information and Computer Engineering
[3] University of Louisiana, The Center for Advanced Computer Studies
Source
Information Retrieval | 2006 / Vol. 9
Keywords
concept-based information retrieval; query reformulation; query concepts
Abstract
In information retrieval, users' information needs are hard to identify, so many approaches expand the initial query and reweight the terms of the expanded query using users' relevance judgments. Relevance feedback is most effective when users provide relevance information about the retrieved documents, but that information is not always available. Another solution is to use correlated terms for query expansion; the main problem with this approach is how to construct term-term correlations that can be used effectively to improve retrieval performance. In this study, we construct query concepts that denote users' information needs from a document space, rather than reformulating initial queries using term correlations and/or users' relevance feedback. To form query concepts, we extract features from each document and cluster the features into primitive concepts, which are then used to form query concepts. Experiments are performed on the Associated Press (AP) dataset taken from the TREC collection. The experimental evaluation shows that our proposed framework, called QCM (Query Concept Method), outperforms a baseline probabilistic retrieval model on TREC retrieval.
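The pipeline outlined in the abstract (extract features per document, cluster them into primitive concepts, then form a query concept) can be illustrated with a minimal sketch. The abstract does not specify how features are scored or clustered, so everything concrete here is an assumption made for illustration and not the authors' QCM: TF-IDF as the feature score, Jaccard overlap of document sets as the similarity, a greedy single-pass clustering, and the hypothetical names `extract_features`, `cluster_features`, `build_query_concept`, and `DOCS`.

```python
import math
from collections import Counter

# Toy corpus (hypothetical data, not the AP/TREC collection).
DOCS = [
    "stock market shares fall",
    "stock market shares rise",
    "football match goal score",
    "football match goal win",
]

def extract_features(docs, top_k=3):
    """Keep the top_k terms of each document by TF-IDF (assumed feature definition)."""
    tokenized = [doc.lower().split() for doc in docs]
    df = Counter(term for tokens in tokenized for term in set(tokens))
    n = len(docs)
    features = []
    for tokens in tokenized:
        tf = Counter(tokens)
        scored = {t: tf[t] * math.log((n + 1) / df[t]) for t in tf}
        # Sort by descending score, breaking ties alphabetically for determinism.
        ranked = sorted(scored, key=lambda t: (-scored[t], t))
        features.append(set(ranked[:top_k]))
    return features

def cluster_features(features, threshold=0.5):
    """Greedily group feature terms whose supporting document sets overlap
    (Jaccard >= threshold); each group stands in for a primitive concept."""
    postings = {}
    for doc_id, feats in enumerate(features):
        for term in feats:
            postings.setdefault(term, set()).add(doc_id)
    clusters = []  # list of (terms, doc_ids) pairs
    for term in sorted(postings):
        term_docs = postings[term]
        for terms, doc_ids in clusters:
            jaccard = len(term_docs & doc_ids) / len(term_docs | doc_ids)
            if jaccard >= threshold:
                terms.add(term)
                doc_ids.update(term_docs)
                break
        else:
            # No existing cluster is similar enough: start a new one.
            clusters.append(({term}, set(term_docs)))
    return [terms for terms, _ in clusters]

def build_query_concept(query, primitive_concepts):
    """Union every primitive concept that shares at least one term with the query."""
    query_terms = set(query.lower().split())
    concept = set()
    for primitive in primitive_concepts:
        if primitive & query_terms:
            concept |= primitive
    return concept

if __name__ == "__main__":
    concepts = cluster_features(extract_features(DOCS))
    print(concepts)
    print(build_query_concept("market news", concepts))
```

In this sketch a query concept is simply the union of matching primitive concepts; the published method may weight or combine them differently.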
Pages: 231-248
Page count: 17