A case study of distributed information retrieval architectures to index one terabyte of text

Cited by: 16
Authors
Cacheda, F
Plachouras, V
Ounis, I
Affiliations
[1] Univ A Coruna, Fac Informat, Dept Informat & Commun Technol, La Coruna 15071, Spain
[2] Univ Glasgow, Dept Comp Sci, Glasgow G12 8QQ, Lanark, Scotland
Funding
UK Engineering and Physical Sciences Research Council (EPSRC);
Keywords
distributed information retrieval; performance; simulation;
DOI
10.1016/j.ipm.2004.05.002
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
The increasing number of documents to be indexed in many environments (Web, intranets, digital libraries) and the limitations of a single centralised index (lack of scalability, server overloading and failures) lead to the use of distributed information retrieval systems to efficiently search and locate the desired information. This work is a case study of different architectures for a distributed information retrieval system, in order to provide a guide to approximate the optimal architecture with a specific set of resources. We analyse the effectiveness of a distributed, replicated and clustered architecture simulating a variable number of workstations (from 1 up to 4096). A collection of approximately 94 million documents and 1 terabyte (TB) of text is used to test the performance of the different architectures. In a purely distributed information retrieval system, the brokers become the bottleneck due to the high number of local answer sets to be sorted. In a replicated system, the network is the bottleneck due to the high number of query servers and the continuous data interchange with the brokers. Finally, we demonstrate that a clustered system will outperform a replicated system if a high number of query servers is used, essentially due to the reduction of the network load. However, a change in the distribution of the users' queries could reduce the performance of a clustered system. (c) 2004 Elsevier Ltd. All rights reserved.
Pages: 1141 - 1161
Page count: 21
Related papers
50 in total
  • [41] Measurement of Incompatible Probability in Information Retrieval:A Case Study with User Clicks
    王博
    侯越先
    Transactions of Tianjin University , 2013, (01) : 37 - 42
  • [42] Systematic Literature Review Supported by Information Retrieval Techniques: A Case Study
    Abilio, Ramon
    Oliveira, Claudiane
    Vale, Gustavo
    Morais, Flavio
    Pereira, Denilson
    Costa, Heitor
    PROCEEDINGS OF THE 2014 XL LATIN AMERICAN COMPUTING CONFERENCE (CLEI), 2014,
  • [43] THE COMPUTER AND INFORMATION-RETRIEVAL - SCHOOL LAW A CASE-STUDY
    ASHER, JW
    KURFEERST, M
    HARVARD EDUCATIONAL REVIEW, 1965, 35 (02) : 178 - 190
  • [44] Information retrieval in systematic reviews: a case study of the crime prevention literature
    Tompson, Lisa
    Belur, Jyoti
    JOURNAL OF EXPERIMENTAL CRIMINOLOGY, 2016, 12 (02) : 187 - 207
  • [45] Measurement of incompatible probability in information retrieval: A case study with user clicks
    Wang B.
    Hou Y.
    Transactions of Tianjin University, 2013, 19 (1) : 37 - 42
  • [46] Optimising Bayesian belief networks: A case study of information retrieval systems
    Indrawan, MT
    Srinivasan, B
    Wilson, CC
    Redpath, R
    1998 IEEE INTERNATIONAL CONFERENCE ON SYSTEMS, MAN, AND CYBERNETICS, VOLS 1-5, 1998, : 2273 - 2278
  • [47] Challenges in Geographically Distributed Information System Development: A Case Study
    Asp, Jali
    Taipalus, Toni
    Seppanen, Ville
    2021 IEEE 45TH ANNUAL COMPUTERS, SOFTWARE, AND APPLICATIONS CONFERENCE (COMPSAC 2021), 2021, : 452 - 458
  • [48] An Agent Based System for Distributed Information Management: a case study
    Warnier, Martijn
    Timmer, Reinier
    Oey, Michel
    Brazier, Frances
    Oskamp, Anja
    2008 INTERNATIONAL MULTICONFERENCE ON COMPUTER SCIENCE AND INFORMATION TECHNOLOGY (IMCSIT), VOLS 1 AND 2, 2008, : 46 - +
  • [49] Integration of a centralized and highly distributed information system: A case study
    Kranjcec, Denis
    Milicevic, Tanja
    PROCEEDINGS OF THE ITI 2008 30TH INTERNATIONAL CONFERENCE ON INFORMATION TECHNOLOGY INTERFACES, 2008, : 527 - 532
  • [50] Information Architectures Definition - A Case Study in a Portuguese Local Public Administration Organization
    Sa, Filipe
    Rocha, Alvaro
    ADVANCES IN INFORMATION SYSTEMS AND TECHNOLOGIES, 2013, 206 : 399 - 410