Query expansion based on clustering and personalized information retrieval

Authors
Hamid Khalifi
Walid Cherif
Abderrahim El Qadi
Youssef Ghanou
Affiliations
[1] Moulay Ismail University, TIM Team, High School of Technology
[2] National Institute of Statistics and Applied Economics, SI2M Laboratory
[3] Mohammed V University, High School of Technology
Source
Progress in Artificial Intelligence, 2019, 8 (02)
Keywords
Information retrieval; Personalized information retrieval; Automatic query completion; Clustering; Performance evaluation; Support vector machines
DOI
Not available
Chinese Library Classification number
Subject classification number
Abstract
The term information retrieval system covers a variety of processes that deliver information to the people who need it. Several mathematical approaches have been studied to formalize the main components of such a system: query representation, information item representation, and the retrieval process. Even so, these systems still struggle to extract relevant information for users, especially when the processed data are texts, because of the complex nature of text databases. Generally, an information retrieval system reformulates queries according to associations among information items before matching them against dataset items. In this sense, semantic relationships or machine learning techniques can be applied to refine the returned results. This paper presents a formal model to organize data and a new search algorithm to browse it. The approach incorporates a natural language preprocessing stage, a statistical representation of short documents and queries, and a machine learning model to select relevant results. We then propose two further optimizations that returned satisfying results on two datasets in a reasonable computation time. The first optimization concerns query expansion, while the second concerns dataset restructuring. We formally evaluate the impact of each optimization by computing the performance of the information retrieval system with and without it; the highest recall and precision reached were 96.2% and 99.2%, respectively.
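The with-and-without evaluation described in the abstract can be illustrated with a small sketch. This is not the authors' model: it assumes a plain TF-IDF representation with a cosine-similarity threshold in place of the paper's machine learning selector, and the toy documents, the `RELATED` expansion table, and all function names are hypothetical. It only shows how precision and recall are compared for the same query before and after query expansion.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace (stand-in for real preprocessing)."""
    return text.lower().split()


def tfidf(tokens, idf):
    """Weight each term by its count times its inverse document frequency."""
    counts = Counter(tokens)
    return {t: c * idf.get(t, 0.0) for t, c in counts.items()}


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def search(query, docs, related=None, threshold=0.0):
    """Return indices of documents whose similarity to the query exceeds threshold.

    `related` is a hypothetical term -> related-terms table used for expansion.
    """
    doc_tokens = [tokenize(d) for d in docs]
    n = len(docs)
    df = Counter(t for toks in doc_tokens for t in set(toks))
    idf = {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in df.items()}
    q = tokenize(query)
    if related:
        # Query expansion: append the related terms of every query term.
        q = q + [r for t in q for r in related.get(t, [])]
    qv = tfidf(q, idf)
    return {i for i, toks in enumerate(doc_tokens)
            if cosine(qv, tfidf(toks, idf)) > threshold}


def precision_recall(retrieved, relevant):
    """Precision = hits / retrieved, recall = hits / relevant."""
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Toy data (assumed for illustration only).
DOCS = [
    "cheap car rental downtown",
    "automobile hire at the airport",
    "banana bread recipe",
]
RELEVANT = {0, 1}
RELATED = {"car": ["automobile"], "rental": ["hire"]}

baseline = precision_recall(search("car rental", DOCS), RELEVANT)
expanded = precision_recall(search("car rental", DOCS, RELATED), RELEVANT)
print("without expansion:", baseline)
print("with expansion:   ", expanded)
```

On this toy collection the unexpanded query misses the second relevant document because it shares no terms with it, so expansion raises recall without hurting precision. Real collections will not behave this cleanly, since expansion can also pull in irrelevant documents.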
Pages: 241-251
Page count: 10
Related papers
  • [1] Query expansion based on clustering and personalized information retrieval
    Khalifi, Hamid
    Cherif, Walid
    El Qadi, Abderrahim
    Ghanou, Youssef
    PROGRESS IN ARTIFICIAL INTELLIGENCE, 2019, 8 (02) : 241 - 251
  • [2] Clustering Algorithms for Query Expansion Based Information Retrieval
    Khennak, Ilyes
    Drias, Habiba
    Kechid, Amine
    Moulai, Hadjer
    COMPUTATIONAL COLLECTIVE INTELLIGENCE, PT II, 2019, 11684 : 261 - 272
  • [3] Query Expansion for Personalized Cross-Language Information Retrieval
    Zhou, Dong
    Lawless, Seamus
    Liu, Jianxun
    Zhang, Sanrong
    Xu, Yu
    10TH INTERNATIONAL WORKSHOP ON SEMANTIC AND SOCIAL MEDIA ADAPTATION AND PERSONALIZATION SMAP 2015, 2015, : 18 - 22
  • [4] An information retrieval model based on query expansion
    Huang, Mingxuan
    Zhang, Shichao
    Yan, Xiaowei
    Huang, Faliang
    RECENT ADVANCE OF CHINESE COMPUTING TECHNOLOGIES, 2007, : 217 - 221
  • [5] Integrating query expansion and conceptual relevance feedback for personalized Web information retrieval
    Chang, CH
    Hsu, CC
    COMPUTER NETWORKS AND ISDN SYSTEMS, 1998, 30 (1-7): : 621 - 623
  • [6] Parallel information retrieval with query expansion
    Chung, YJ
    IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS, 2004, E87D (06) : 1593 - 1595
  • [7] Parallel information retrieval with query expansion
    Chung, Y
    APPLIED PARALLEL COMPUTING: ADVANCED SCIENTIFIC COMPUTING, 2002, 2367 : 195 - 202
  • [8] Personalized query suggestion diversification in information retrieval
    Chen, Wanyu
    Cai, Fei
    Chen, Honghui
    De Rijke, Maarten
    FRONTIERS OF COMPUTER SCIENCE, 2020, 14 (03)